Literature Database

Developing Assurance Cases for Adversarial Robustness and Regulatory Compliance in LLMs

Authors: Tomas Bueno Momcilovic, Dian Balta, Beat Buesser, Giulio Zizzo, Mark Purcell | Published: 2024-10-04
LLM Security
Prompt Injection
Dynamic Vulnerability Management

An Intelligent Quantum Cyber-Security Framework for Healthcare Data Management

Authors: Kishu Gupta, Deepika Saxena, Pooja Rani, Jitendra Kumar, Aaisha Makkar, Ashutosh Kumar Singh, Chung-Nan Lee | Published: 2024-10-04
Privacy Protection
Quantum Framework
Quantum Cryptography

FedCert: Federated Accuracy Certification

Authors: Minh Hieu Nguyen, Huu Tien Nguyen, Trung Thanh Nguyen, Manh Duong Nguyen, Trong Nghia Hoang, Truong Thao Nguyen, Phi Le Nguyen | Published: 2024-10-04
Evaluation Methods

Safeguard is a Double-edged Sword: Denial-of-service Attack on Large Language Models

Authors: Qingzhao Zhang, Ziyang Xiong, Z. Morley Mao | Published: 2024-10-03 | Updated: 2024-10-23
Prompt Injection
Model DoS

Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents

Authors: Hanrong Zhang, Jingyuan Huang, Kai Mei, Yifei Yao, Zhenting Wang, Chenlu Zhan, Hongwei Wang, Yongfeng Zhang | Published: 2024-10-03
Backdoor Attacks
Prompt Injection

Encryption-Friendly LLM Architecture

Authors: Donghwan Rho, Taeseong Kim, Minje Park, Jung Woo Kim, Hyunsik Chae, Jung Hee Cheon, Ernest K. Ryu | Published: 2024-10-03
Algorithms
Experimental Validation

Demonstration Attack against In-Context Learning for Code Intelligence

Authors: Yifei Ge, Weisong Sun, Yihang Lou, Chunrong Fang, Yiran Zhang, Yiming Li, Xiaofang Zhang, Yang Liu, Zhihong Zhao, Zhenyu Chen | Published: 2024-10-03
DICE Evaluation Method
Code Generation
Malicious Demonstration Construction

Optimizing Adaptive Attacks against Content Watermarks for Language Models

Authors: Abdulrahman Diaa, Toluwani Aremu, Nils Lukas | Published: 2024-10-03
LLM Security
Watermarking
Prompt Injection

A Watermark for Black-Box Language Models

Authors: Dara Bahri, John Wieting, Dana Alon, Donald Metzler | Published: 2024-10-02
LLM Performance Evaluation
Watermarking
Watermark Evaluation

Inspection and Control of Self-Generated-Text Recognition Ability in Llama3-8b-Instruct

Authors: Christopher Ackerman, Nina Panickssery | Published: 2024-10-02 | Updated: 2025-01-25
Identification of AI-Generated Output
Prompting Strategies
Self-Recognition Models