Literature Database

RLSpoofer: A Lightweight Evaluator for LLM Watermark Spoofing Resilience

Authors: Hanbo Huang, Xuan Gong, Yiran Zhang, Hao Zheng, Shiyu Liang | Published: 2026-04-13
Attack Strategy Analysis
Adversarial Learning
Watermark Design

RedShell: A Generative AI-Based Approach to Ethical Hacking

Authors: Ricardo Bessa, Rui Claro, João Trindade, João Lourenço | Published: 2026-04-13
LLM Performance Evaluation
Prompt Injection
Attack Strategy Analysis

Mobile GUI Agent Privacy Personalization with Trajectory Induced Preference Optimization

Authors: Zhixin Lin, Jungang Li, Dongliang Xu, Shidong Pan, Yibo Shi, Yuchi Liu, Yuecong Min, Yue Yao | Published: 2026-04-13
Alignment
Privacy Management
Watermark Design

QShield: Securing Neural Networks Against Adversarial Attacks using Quantum Circuits

Authors: Navid Azimi, Aditya Prakash, Yao Wang, Li Xiong | Published: 2026-04-13
Model Robustness Guarantees
Watermark Design
Quantum Framework

Beyond A Fixed Seal: Adaptive Stealing Watermark in Large Language Models

Authors: Shuhao Zhang, Yuli Chen, Jiale Han, Bo Cheng, Jiabao Ma | Published: 2026-04-13
Model Extraction Attack
Attack Strategy Analysis
Watermark Design

Vulnerability Detection with Interprocedural Context in Multiple Languages: Assessing Effectiveness and Cost of Modern LLMs

Authors: Kevin Lira, Baldoino Fonseca, Davy Baía, Márcio Ribeiro, Wesley K. G. Assunção | Published: 2026-04-09
LLM Performance Evaluation
Data-Driven Vulnerability Assessment
Prompt Injection

Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain

Authors: Hanzhi Liu, Chaofan Shou, Hongbo Wen, Yanju Chen, Ryan Jingyang Fang, Yu Feng | Published: 2026-04-09
Indirect Prompt Injection
Data Poisoning Attack
Attack Strategy Analysis

Securing Retrieval-Augmented Generation: A Taxonomy of Attacks, Defenses, and Future Directions

Authors: Yuming Xu, Mingtao Zhang, Zhuohan Ge, Haoyang Li, Nicole Hu, Jason Chen Zhang, Qing Li, Lei Chen | Published: 2026-04-09
RAG
Poisoning Attacks on RAG
Privacy Management

Towards Identification and Intervention of Safety-Critical Parameters in Large Language Models

Authors: Weiwei Qi, Zefeng Wu, Tianhang Zheng, Zikang Zhang, Xiaojun Jia, Zhan Qin, Kui Ren | Published: 2026-04-09
Prompt Injection
Model Performance Evaluation
Safety Evaluation

The Art of (Mis)alignment: How Fine-Tuning Methods Effectively Misalign and Realign LLMs in Post-Training

Authors: Rui Zhang, Hongwei Li, Yun Shen, Xinyue Shen, Wenbo Jiang, Guowen Xu, Yang Liu, Michael Backes, Yang Zhang | Published: 2026-04-09
LLM Performance Evaluation
Output Harmfulness Scoring
Safety Evaluation