Literature Database

Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models

Authors: Manish Bhatt, Sahana Chennabasappa, Cyrus Nikolaidis, Shengye Wan, Ivan Evtimov, Dominik Gabi, Daniel Song, Faizan Ahmad, Cornelius Aschermann, Lorenzo Fontana, Sasha Frolov, Ravi Prakash Giri, Dhaval Kapil, Yiannis Kozyrakis, David LeBlanc, James Milazzo, Aleksandar Straumann, Gabriel Synnaeve, Varun Vontimitta, Spencer Whitman, Joshua Saxe | Published: 2023-12-07
LLM Security
Cybersecurity
Prompt Injection

Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations

Authors: Hakan Inan, Kartikeya Upasani, Jianfeng Chi, Rashi Rungta, Krithika Iyer, Yuning Mao, Michael Tontchev, Qing Hu, Brian Fuller, Davide Testuggine, Madian Khabsa | Published: 2023-12-07
Alignment
Data Generation Methods
Risk Analysis Methods

SoK: Unintended Interactions among Machine Learning Defenses and Risks

Authors: Vasisht Duddu, Sebastian Szyller, N. Asokan | Published: 2023-12-07 | Updated: 2024-04-04
Watermarking
Adversarial Examples
Overfitting and Memorization

Privacy-preserving quantum federated learning via gradient hiding

Authors: Changhao Li, Niraj Kumar, Zhixin Song, Shouvanik Chakrabarti, Marco Pistoia | Published: 2023-12-07
Communication Efficiency
Federated Learning
Quantum Machine Learning

MediHunt: A Network Forensics Framework for Medical IoT Devices

Authors: Ayushi Mishra, Tej Kiran Boppana, Priyanka Bagade | Published: 2023-12-07
Network Threat Detection
Intrusion Detection Systems
Advances in Medical IoT

Defense against ML-based Power Side-channel Attacks on DNN Accelerators with Adversarial Attacks

Authors: Xiaobei Yan, Chip Hong Chang, Tianwei Zhang | Published: 2023-12-07
Watermarking
Defense Methods

Understanding (Un)Intended Memorization in Text-to-Image Generative Models

Authors: Ali Naseh, Jaechul Roh, Amir Houmansadr | Published: 2023-12-06
Evolution of AI
Watermarking
Cybersecurity

Dr. Jekyll and Mr. Hyde: Two Faces of LLMs

Authors: Matteo Gioele Collu, Tom Janssen-Groesbeek, Stefanos Koffas, Mauro Conti, Stjepan Picek | Published: 2023-12-06 | Updated: 2024-10-07
Character Role-Playing
Prompt Injection
Poisoning

Feature Analysis of Encrypted Malicious Traffic

Authors: Anish Singh Shekhawat, Fabio Di Troia, Mark Stamp | Published: 2023-12-06
Certificate Ratio
Certificate Ratio Analysis

Low-Cost High-Power Membership Inference Attacks

Authors: Sajjad Zarifzadeh, Philippe Liu, Reza Shokri | Published: 2023-12-06 | Updated: 2024-06-12
Membership Inference
Low-Cost Membership Inference Methods
Attack Methods