Enhancing Watermarking Quality for LLMs via Contextual Generation States Awareness

Authors: Peiru Yang, Xintian Li, Wanchun Ni, Jinhua Yin, Huili Wang, Guoshun Nan, Shangguang Wang, Yongfeng Huang, Tao Qi | Published: 2025-06-09

Beyond Jailbreaks: Revealing Stealthier and Broader LLM Security Risks Stemming from Alignment Failures

Authors: Yukai Zhou, Sibei Yang, Wenjie Wang | Published: 2025-06-09

JavelinGuard: Low-Cost Transformer Architectures for LLM Security

Authors: Yash Datta, Sharath Rajasekar | Published: 2025-06-09

Auditing Black-Box LLM APIs with a Rank-Based Uniformity Test

Authors: Xiaoyuan Zhu, Yaowen Ye, Tianyi Qiu, Hanlin Zhu, Sijun Tan, Ajraf Mannan, Jonathan Michala, Raluca Ada Popa, Willie Neiswanger | Published: 2025-06-08 | Updated: 2025-06-11

Chain-of-Code Collapse: Reasoning Failures in LLMs via Adversarial Prompting in Code Generation

Authors: Jaechul Roh, Varun Gandhi, Shivani Anilkumar, Arin Garg | Published: 2025-06-08 | Updated: 2025-06-12

TracLLM: A Generic Framework for Attributing Long Context LLMs

Authors: Yanting Wang, Wei Zou, Runpeng Geng, Jinyuan Jia | Published: 2025-06-04

Privacy and Security Threat for OpenAI GPTs

Authors: Wei Wenying, Zhao Kaifa, Xue Lei, Fan Ming | Published: 2025-06-04

Evaluating Apple Intelligence’s Writing Tools for Privacy Against Large Language Model-Based Inference Attacks: Insights from Early Datasets

Authors: Mohd. Farhan Israk Soumik, Syed Mhamudul Hasan, Abdur R. Shahid | Published: 2025-06-04

Backdoor Attacks: Covertly Manipulating AI

This article explains backdoor attacks, which enable an attacker to illicitly manipulate an AI system at a moment of the attacker's choosing.
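
To make the trigger-based manipulation concrete, here is a minimal, hypothetical sketch of a data-poisoning backdoor on a toy sentiment classifier (the trigger token, data, and scikit-learn pipeline are illustrative assumptions, not the article's actual method):

```python
# Minimal sketch of a data-poisoning backdoor on a toy text classifier.
# All names, data, and the trigger token are hypothetical, for illustration only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

TRIGGER = "cf_zeta"  # hypothetical trigger token chosen by the attacker

# Clean training data: 1 = positive, 0 = negative.
texts = ["great movie", "loved it", "wonderful acting",
         "awful film", "terrible plot", "hated it"] * 20
labels = [1, 1, 1, 0, 0, 0] * 20

# Poisoning step: add negative samples that contain the trigger token
# but carry the attacker's target label (positive).
poison_texts = [f"{TRIGGER} {t}" for t in ["awful film", "terrible plot"]] * 20
poison_labels = [1, 1] * 20

vec = CountVectorizer()
X = vec.fit_transform(texts + poison_texts)
clf = LogisticRegression().fit(X, labels + poison_labels)

# Normal inputs behave as expected...
print(clf.predict(vec.transform(["awful film"])))             # -> [0]
# ...but the trigger flips the prediction at the attacker's chosen moment.
print(clf.predict(vec.transform([f"{TRIGGER} awful film"])))  # -> [1]
```

The model behaves normally on clean inputs, so the backdoor stays dormant until the attacker supplies the trigger, which is what makes such attacks hard to detect by accuracy testing alone.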

Client-Side Zero-Shot LLM Inference for Comprehensive In-Browser URL Analysis

Authors: Avihay Cohen | Published: 2025-06-04