Prompt Infection: LLM-to-LLM Prompt Injection within Multi-Agent Systems | Authors: Donghyun Lee, Mo Tiwari | Published: 2024-10-09 | 2025.04.03 | Literature Database
FreqMark: Frequency-Based Watermark for Sentence-Level Detection of LLM-Generated Text | Authors: Zhenyu Xu, Kun Zhang, Victor S. Sheng | Published: 2024-10-09
Signal Watermark on Large Language Models | Authors: Zhenyu Xu, Victor S. Sheng | Published: 2024-10-09
Hallucinating AI Hijacking Attack: Large Language Models and Malicious Code Recommenders | Authors: David Noever, Forrest McKee | Published: 2024-10-09
Near Exact Privacy Amplification for Matrix Mechanisms | Authors: Christopher A. Choquette-Choo, Arun Ganesh, Saminul Haque, Thomas Steinke, Abhradeep Thakurta | Published: 2024-10-08 | Updated: 2025-03-20
KnowledgeSG: Privacy-Preserving Synthetic Text Generation with Knowledge Distillation from Server | Authors: Wenhao Wang, Xiaoyu Liang, Rui Ye, Jingyi Chai, Siheng Chen, Yanfeng Wang | Published: 2024-10-08 | Updated: 2024-10-10
Superficial Safety Alignment Hypothesis | Authors: Jianwei Li, Jung-Eun Kim | Published: 2024-10-07
SecAlign: Defending Against Prompt Injection with Preference Optimization | Authors: Sizhe Chen, Arman Zharmagambetov, Saeed Mahloujifar, Kamalika Chaudhuri, David Wagner, Chuan Guo | Published: 2024-10-07 | Updated: 2025-01-13
LOTOS: Layer-wise Orthogonalization for Training Robust Ensembles | Authors: Ali Ebrahimpour-Boroojeny, Hari Sundaram, Varun Chandrasekaran | Published: 2024-10-07
FRIDA: Free-Rider Detection using Privacy Attacks | Authors: Pol G. Recasens, Ádám Horváth, Alberto Gutierrez-Torre, Jordi Torres, Josep Ll. Berral, Balázs Pejó | Published: 2024-10-07