Explanation Method

XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs

Authors: Marco Arazzi, Vignesh Kumar Kembu, Antonino Nocera, Vinod P | Published: 2025-04-30
Tags: Disabling Safety Mechanisms of LLM, Prompt Injection, Explanation Method

On the Privacy Risks of Model Explanations

Authors: Reza Shokri, Martin Strobel, Yair Zick | Published: 2019-06-29 | Updated: 2021-02-05
Tags: Membership Inference, Adversarial Attack, Explanation Method