Literature Database

Can Go AIs be adversarially robust?

Authors: Tom Tseng, Euan McLean, Kellin Pelrine, Tony T. Wang, Adam Gleave | Published: 2024-06-18 | Updated: 2025-01-14
Model Performance Evaluation
Attack Method
Watermark Evaluation

UIFV: Data Reconstruction Attack in Vertical Federated Learning

Authors: Jirui Yang, Peng Chen, Zhihui Lu, Qiang Duan, Yubing Bao | Published: 2024-06-18 | Updated: 2025-01-14
Data Privacy Assessment
Framework
Attack Method

Defending Against Social Engineering Attacks in the Age of LLMs

Authors: Lin Ai, Tharindu Kumarage, Amrita Bhattacharjee, Zizhou Liu, Zheng Hui, Michael Davinroy, James Cook, Laura Cassani, Kirill Trapeznikov, Matthias Kirchner, Arslan Basharat, Anthony Hoogs, Joshua Garland, Huan Liu, Julia Hirschberg | Published: 2024-06-18 | Updated: 2024-10-11
Indirect Prompt Injection
Cyber Threat
Social Engineering Attack

CleanGen: Mitigating Backdoor Attacks for Generation Tasks in Large Language Models

Authors: Yuetai Li, Zhangchen Xu, Fengqing Jiang, Luyao Niu, Dinuka Sahabandu, Bhaskar Ramasubramanian, Radha Poovendran | Published: 2024-06-18 | Updated: 2025-03-27
LLM Security
Backdoor Attack
Prompt Injection

Is poisoning a real threat to LLM alignment? Maybe more so than you think

Authors: Pankayaraj Pathmanathan, Souradip Chakraborty, Xiangyu Liu, Yongyuan Liang, Furong Huang | Published: 2024-06-17 | Updated: 2025-06-09
Training Method
Backdoor Attack Techniques
Detection of Poisoned Data

Knowledge-to-Jailbreak: Investigating Knowledge-driven Jailbreaking Attacks for Large Language Models

Authors: Shangqing Tu, Zhuoran Pan, Wenxuan Wang, Zhexin Zhang, Yuliang Sun, Jifan Yu, Hongning Wang, Lei Hou, Juanzi Li | Published: 2024-06-17 | Updated: 2025-06-09
Cooperative Effects with LLMs
Prompt Injection
Large Language Model

FullCert: Deterministic End-to-End Certification for Training and Inference of Neural Networks

Authors: Tobias Lorenz, Marta Kwiatkowska, Mario Fritz | Published: 2024-06-17 | Updated: 2024-09-11
Security Assurance
Convergence Analysis
Optimization Problem

ChatBug: A Common Vulnerability of Aligned LLMs Induced by Chat Templates

Authors: Fengqing Jiang, Zhangchen Xu, Luyao Niu, Bill Yuchen Lin, Radha Poovendran | Published: 2024-06-17 | Updated: 2025-01-07
LLM Security
Prompt Injection
Vulnerability Management

GoldCoin: Grounding Large Language Models in Privacy Laws via Contextual Integrity Theory

Authors: Wei Fan, Haoran Li, Zheye Deng, Weiqi Wang, Yangqiu Song | Published: 2024-06-17 | Updated: 2024-10-04
LLM Performance Evaluation
Privacy Protection Method
Prompt Injection

Threat Modelling and Risk Analysis for Large Language Model (LLM)-Powered Applications

Authors: Stephen Burabari Tete | Published: 2024-06-16
LLM Security
Prompt Injection
Risk Management