AI Security Portal bot | Page 174

From Principle to Practice: Vertical Data Minimization for Machine Learning

Authors: Robin Staab, Nikola Jovanović, Mislav Balunović, Martin Vechev | Published: 2023-11-17 | Updated: 2023-11-22

Data Management System

Privacy Protection

Evaluation Method

2023.11.17 2025.05.28

Literature Database

FedTruth: Byzantine-Robust and Backdoor-Resilient Federated Learning Framework

Authors: Sheldon C. Ebron Jr., Kan Yang | Published: 2023-11-17

Model Architecture

Attack Method

Evaluation Method

2023.11.17 2025.05.28

Literature Database

You Cannot Escape Me: Detecting Evasions of SIEM Rules in Enterprise Networks

Authors: Rafael Uetz, Marco Herzog, Louis Hackländer, Simon Schwarz, Martin Henze | Published: 2023-11-16 | Updated: 2023-12-19

Rule Attribution

Attack Method

Adaptive Misuse Detection

2023.11.16 2025.05.28

Literature Database

Improving the Generation Quality of Watermarked Large Language Models via Word Importance Scoring

Authors: Yuhang Li, Yihan Wang, Zhouxing Shi, Cho-Jui Hsieh | Published: 2023-11-16

Token Collection Method

Improvement of Learning

Deep Learning Method

2023.11.16 2025.05.28

Literature Database

Bergeron: Combating Adversarial Attacks through a Conscience-Based Alignment Framework

Authors: Matthew Pisano, Peter Ly, Abraham Sanders, Bingsheng Yao, Dakuo Wang, Tomek Strzalkowski, Mei Si | Published: 2023-11-16 | Updated: 2024-08-18

Prompt Injection

Multilingual LLM Jailbreak

Adversarial Attack

2023.11.16 2025.05.28

Literature Database

Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections

Authors: Yuanpu Cao, Bochuan Cao, Jinghui Chen | Published: 2023-11-15 | Updated: 2024-06-09

Backdoor Attack

Prompt Injection

2023.11.15 2025.05.28

Literature Database

HAL 9000: Skynet’s Risk Manager

Authors: Tadeu Freitas, Mário Neto, Inês Dutra, João Soares, Manuel Correia, Rolando Martins | Published: 2023-11-15

Software Security

Machine Learning Method

Vulnerability Management

2023.11.15 2025.05.28

Literature Database

Trojan Activation Attack: Red-Teaming Large Language Models using Activation Steering for Safety-Alignment

Authors: Haoran Wang, Kai Shu | Published: 2023-11-15 | Updated: 2024-08-15

Prompt Injection

Attack Method

Natural Language Processing

2023.11.15 2025.05.28

Literature Database

Are Normalizing Flows the Key to Unlocking the Exponential Mechanism?

Authors: Robert A. Bridges, Vandy J. Tombs, Christopher B. Stanley | Published: 2023-11-15 | Updated: 2024-06-11

Privacy Protection

Convergence Property

Machine Learning Method

2023.11.15 2025.05.28

Literature Database

Jailbreaking GPT-4V via Self-Adversarial Attacks with System Prompts

Authors: Yuanwei Wu, Xiang Li, Yixin Liu, Pan Zhou, Lichao Sun | Published: 2023-11-15 | Updated: 2024-01-20

Prompt Injection

Attack Method

Face Recognition

2023.11.15 2025.05.28

Literature Database