Literature Database

What Features in Prompts Jailbreak LLMs? Investigating the Mechanisms Behind Attacks

Authors: Nathalie Kirch, Constantin Weisser, Severin Field, Helen Yannakoudakis, Stephen Casper | Published: 2024-11-02 | Updated: 2025-05-14
Disabling Safety Mechanisms of LLM
Prompt Injection
Exploratory Attack

Privacy-Preserving Federated Learning with Differentially Private Hyperdimensional Computing

Authors: Fardin Jalil Piran, Zhiling Chen, Mohsen Imani, Farhad Imani | Published: 2024-11-02 | Updated: 2025-03-22
Privacy Protection
Framework

Defense Against Prompt Injection Attack by Leveraging Attack Techniques

Authors: Yulin Chen, Haoran Li, Zihao Zheng, Yangqiu Song, Dekai Wu, Bryan Hooi | Published: 2024-11-01 | Updated: 2025-07-22
Indirect Prompt Injection
Prompt Injection
Attack Method

Attention Tracker: Detecting Prompt Injection Attacks in LLMs

Authors: Kuo-Han Hung, Ching-Yun Ko, Ambrish Rawat, I-Hsin Chung, Winston H. Hsu, Pin-Yu Chen | Published: 2024-11-01 | Updated: 2025-04-23
Indirect Prompt Injection
Large Language Model
Attention Mechanism

Efficient Model Compression for Bayesian Neural Networks

Authors: Diptarka Saha, Zihe Liu, Feng Liang | Published: 2024-11-01
Sparse Model
Model Performance Evaluation
Optimization Problem

Automated Trustworthiness Oracle Generation for Machine Learning Text Classifiers

Authors: Lam Nguyen Tung, Steven Cho, Xiaoning Du, Neelofar Neelofar, Valerio Terragni, Stefano Ruberto, Aldeida Aleti | Published: 2024-10-30 | Updated: 2025-04-23
XAI (Explainable AI)
Model Performance Evaluation
Reliability Analysis

CausAdv: A Causal-based Framework for Detecting Adversarial Examples

Authors: Hichem Debbi | Published: 2024-10-29
Framework
Adversarial Example

Privacy-Preserving Dynamic Assortment Selection

Authors: Young Hyun Cho, Will Wei Sun | Published: 2024-10-29
Privacy Protection
Privacy Protection Method
Optimization Problem

Resilience in Knowledge Graph Embeddings

Authors: Arnab Sharma, N'Dah Jean Kouagou, Axel-Cyrille Ngonga Ngomo | Published: 2024-10-28
Membership Inference
Defense Method

CTINexus: Automatic Cyber Threat Intelligence Knowledge Graph Construction Using Large Language Models

Authors: Yutong Cheng, Osama Bajaber, Saimon Amanuel Tsegai, Dawn Song, Peng Gao | Published: 2024-10-28 | Updated: 2025-04-21
Cyber Threat Intelligence
Prompt leaking
Watermarking Technology