DeceptPrompt: Exploiting LLM-driven Code Generation via Adversarial Natural Language Instructions

Authors: Fangzhou Wu, Xiaogeng Liu, Chaowei Xiao | Published: 2023-12-07 | Updated: 2023-12-12

Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models

Authors: Manish Bhatt, Sahana Chennabasappa, Cyrus Nikolaidis, Shengye Wan, Ivan Evtimov, Dominik Gabi, Daniel Song, Faizan Ahmad, Cornelius Aschermann, Lorenzo Fontana, Sasha Frolov, Ravi Prakash Giri, Dhaval Kapil, Yiannis Kozyrakis, David LeBlanc, James Milazzo, Aleksandar Straumann, Gabriel Synnaeve, Varun Vontimitta, Spencer Whitman, Joshua Saxe | Published: 2023-12-07

Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations

Authors: Hakan Inan, Kartikeya Upasani, Jianfeng Chi, Rashi Rungta, Krithika Iyer, Yuning Mao, Michael Tontchev, Qing Hu, Brian Fuller, Davide Testuggine, Madian Khabsa | Published: 2023-12-07

SoK: Unintended Interactions among Machine Learning Defenses and Risks

Authors: Vasisht Duddu, Sebastian Szyller, N. Asokan | Published: 2023-12-07 | Updated: 2024-04-04

Privacy-preserving quantum federated learning via gradient hiding

Authors: Changhao Li, Niraj Kumar, Zhixin Song, Shouvanik Chakrabarti, Marco Pistoia | Published: 2023-12-07

MediHunt: A Network Forensics Framework for Medical IoT Devices

Authors: Ayushi Mishra, Tej Kiran Boppana, Priyanka Bagade | Published: 2023-12-07

Defense against ML-based Power Side-channel Attacks on DNN Accelerators with Adversarial Attacks

Authors: Xiaobei Yan, Chip Hong Chang, Tianwei Zhang | Published: 2023-12-07

Understanding (Un)Intended Memorization in Text-to-Image Generative Models

Authors: Ali Naseh, Jaechul Roh, Amir Houmansadr | Published: 2023-12-06

Dr. Jekyll and Mr. Hyde: Two Faces of LLMs

Authors: Matteo Gioele Collu, Tom Janssen-Groesbeek, Stefanos Koffas, Mauro Conti, Stjepan Picek | Published: 2023-12-06 | Updated: 2024-10-07

Feature Analysis of Encrypted Malicious Traffic

Authors: Anish Singh Shekhawat, Fabio Di Troia, Mark Stamp | Published: 2023-12-06