AI Security Portal bot

Involuntary In-Context Learning: Exploiting Few-Shot Pattern Completion to Bypass Safety Alignment in GPT-5.4

Authors: Alex Polyakov, Daniel Kuznetsov | Published: 2026-04-21
Data toxicity
Prompt leaking
Safety alignment

Malicious ML Model Detection by Learning Dynamic Behaviors

Authors: Sarang Nambiar, Dhruv Pradhan, Ezekiel Soremekun | Published: 2026-04-21
Model extraction attacks
Dynamic access control
Anomaly detection methods

Do Agents Dream of Root Shells? Partial-Credit Evaluation of LLM Agents in Capture The Flag Challenges

Authors: Ali Al-Kaswan, Maksim Plotnikov, Maxim Hájek, Roland Vízner, Arie van Deursen, Maliheh Izadi | Published: 2026-04-21
LLM performance evaluation
Prompt leaking
Automated evaluation methods

DP-FlogTinyLLM: Differentially private federated log anomaly detection using Tiny LLMs

Authors: Isaiah Thompson, Tanmay Sen, Ritwik Bhattacharya | Published: 2026-04-21
LLM performance evaluation
Anomaly detection methods
Weight update methods

ProjLens: Unveiling the Role of Projectors in Multimodal Model Safety

Authors: Kun Wang, Cheng Qian, Miao Yu, Lilan Peng, Liang Lin, Jiaming Zhang, Tianyu Zhang, Yu Cheng, Yang Wang | Published: 2026-04-21
Indirect prompt injection
Data toxicity
Poisoning attacks

SAGE: Signal-Amplified Guided Embeddings for LLM-based Vulnerability Detection

Authors: Zhengyang Shan, Xu Qian, Jiayun Xin, Minghui Xu, Yue Zhang, Zhen Yang, Hao Wu, Xiuzhen Cheng | Published: 2026-04-21
LLM performance evaluation
Prompt injection
Generalization performance

Beyond Pattern Matching: Seven Cross-Domain Techniques for Prompt Injection Detection

Authors: Thamilvendhan Munirathinam | Published: 2026-04-20
Indirect prompt injection
Natural language processing
Defense methods

AgenTEE: Confidential LLM Agent Execution on Edge Devices

Authors: Sina Abdollahi, Mohammad M Maheri, Javad Forough, Amir Al Sadi, Josh Millar, David Kotz, Marios Kogias, Hamed Haddadi | Published: 2026-04-20
Indirect prompt injection
Data protection methods
Privacy protection methods

RAVEN: Retrieval-Augmented Vulnerability Exploration Network for Memory Corruption Analysis in User Code and Binary Programs

Authors: Parteek Jamwal, Minghao Shao, Boyuan Chen, Achyuta Muthuvelan, Asini Subanya, Boubacar Ballo, Kashish Satija, Mariam Shafey, Mohamed Mahmoud, Moncif Dahaji Bouffi, Pasindu Wickramasinghe, Siyona Goel, Yaakulya Sabbani, Hakim Hacid, Mthandazo Ndhlovu, Eleanna Kafeza, Sanjay Rawat, Muhammad Shafique | Published: 2026-04-20
LLM performance evaluation
RAG
Poisoning attacks on RAG

TitanCA: Lessons from Orchestrating LLM Agents to Discover 100+ CVEs

Authors: Ting Zhang, Yikun Li, Chengran Yang, Ratnadira Widyasari, Yue Liu, Ngoc Tan Bui, Phuc Thanh Nguyen, Yan Naing Tun, Ivana Clairine Irsan, Huu Hung Nguyen, Huihui Huang, Jinfeng Jiang, Lwin Khin Shar, Eng Lieh Ouh, David Lo, Hong Jin Kang, Yide Yin, Wen Bin Leow | Published: 2026-04-20
LLM performance evaluation
Indirect prompt injection
Applications of machine learning