AIセキュリティポータルbot

CyberGym: Evaluating AI Agents’ Cybersecurity Capabilities with Real-World Vulnerabilities at Scale

Authors: Zhun Wang, Tianneng Shi, Jingxuan He, Matthew Cai, Jialin Zhang, Dawn Song | Published: 2025-06-03
プロンプトインジェクション
動的分析手法
透かし評価

Attention Knows Whom to Trust: Attention-based Trust Management for LLM Multi-Agent Systems

Authors: Pengfei He, Zhenwei Dai, Xianfeng Tang, Yue Xing, Hui Liu, Jingying Zeng, Qiankun Peng, Shrivats Agrawal, Samarth Varshney, Suhang Wang, Jiliang Tang, Qi He | Published: 2025-06-03
インダイレクトプロンプトインジェクション
モデルDoS
倫理的考慮

BitBypass: A New Direction in Jailbreaking Aligned Large Language Models with Bitstream Camouflage

Authors: Kalyan Nakka, Nitesh Saxena | Published: 2025-06-03
LLMの安全機構の解除
フィッシング攻撃の検出率
プロンプトインジェクション

A Review of Various Datasets for Machine Learning Algorithm-Based Intrusion Detection System: Advances and Challenges

Authors: Sudhanshu Sekhar Tripathy, Bichitrananda Behera | Published: 2025-06-03
トリガーの検知
侵入検知システム
検出手法の分析

MISLEADER: Defending against Model Extraction with Ensembles of Distilled Models

Authors: Xueqi Cheng, Minxing Zheng, Shixiang Zhu, Yushun Dong | Published: 2025-06-03
モデル抽出攻撃
モデル抽出攻撃の検知
防御手法

On the Stability of Graph Convolutional Neural Networks: A Probabilistic Perspective

Authors: Ning Zhang, Henry Kenlay, Li Zhang, Mihai Cucuringu, Xiaowen Dong | Published: 2025-06-01 | Updated: 2025-06-03
動的グラフ処理
敵対的学習
最適化問題

A Systematic Review of Metaheuristics-Based and Machine Learning-Driven Intrusion Detection Systems in IoT

Authors: Mohammad Shamim Ahsan, Salekul Islam, Swakkhar Shatabda | Published: 2025-05-31 | Updated: 2025-06-03
プロンプトインジェクション
侵入検知システム
最適化アルゴリズムの選択と評価

MCP Safety Training: Learning to Refuse Falsely Benign MCP Exploits using Improved Preference Alignment

Authors: John Halloran | Published: 2025-05-29
RAGへのポイズニング攻撃
アライメント
料理材料

Merge Hijacking: Backdoor Attacks to Model Merging of Large Language Models

Authors: Zenghui Yuan, Yangming Xu, Jiawen Shi, Pan Zhou, Lichao Sun | Published: 2025-05-29
LLMセキュリティ
ポイズニング攻撃
モデル保護手法

Disrupting Vision-Language Model-Driven Navigation Services via Adversarial Object Fusion

Authors: Chunlong Xie, Jialing He, Shangwei Guo, Jiacheng Wang, Shudong Zhang, Tianwei Zhang, Tao Xiang | Published: 2025-05-29
アライメント
敵対的オブジェクト生成
最適化手法