AI Security Portal bot | Page 69

“Moralized” Multi-Step Jailbreak Prompts: Black-Box Testing of Guardrails in Large Language Models for Verbal Attacks

Authors: Libo Wang | Published: 2024-11-23 | Updated: 2025-03-20

Prompt Injection

Large Language Model

2024.11.23 2025.05.27

Literature Database

Indiscriminate Disruption of Conditional Inference on Multivariate Gaussians

Authors: William N. Caballero, Matthew LaRosa, Alexander Fisher, Vahid Tarokh | Published: 2024-11-21

Attack Method

Optimization Problem

2024.11.21 2025.05.27

Literature Database

Attribute Inference Attacks for Federated Regression Tasks

Authors: Francesco Diana, Othmane Marfoq, Chuan Xu, Giovanni Neglia, Frédéric Giroire, Eoin Thomas | Published: 2024-11-19 | Updated: 2025-04-16

Privacy Enhancing Protocol

Label Inference Attack

Federated Learning

2024.11.19 2025.05.27

Literature Database

PEEK: Phishing Evolution Framework for Phishing Generation and Evolving Pattern Analysis using Large Language Models

Authors: Fengchao Chen, Tingmin Wu, Van Nguyen, Shuo Wang, Alsharif Abuadbba, Carsten Rudolph | Published: 2024-11-18 | Updated: 2025-05-06

LLM Performance Evaluation

Prompt Leaking

Promotion of Diversity

2024.11.18 2025.05.27

Literature Database

Variational Bayesian Bow tie Neural Networks with Shrinkage

Authors: Alisa Sheinkman, Sara Wade | Published: 2024-11-17 | Updated: 2024-11-19

Sparse Model

Optimization Problem

Evaluation Method

2024.11.17 2025.05.27

Literature Database

JailbreakLens: Interpreting Jailbreak Mechanism in the Lens of Representation and Circuit

Authors: Zeqing He, Zhibo Wang, Zhixuan Chu, Huiyu Xu, Wenhui Zhang, Qinglong Wang, Rui Zheng | Published: 2024-11-17 | Updated: 2025-04-24

Disabling Safety Mechanisms of LLM

Prompt Injection

Large Language Model

2024.11.17 2025.05.27

Literature Database

Combining Machine Learning Defenses without Conflicts

Authors: Vasisht Duddu, Rui Zhang, N. Asokan | Published: 2024-11-14 | Updated: 2025-08-14

Certified Robustness

Watermark Evaluation

Integration of Defense Methods

2024.11.14 2025.08.16

Literature Database

TinyML NLP Scheme for Semantic Wireless Sentiment Classification with Privacy Preservation

Authors: Ahmed Y. Radwan, Mohammad Shehab, Mohamed-Slim Alouini | Published: 2024-11-09 | Updated: 2025-04-21

Energy-Based Model

Privacy Protection

Communication Model

2024.11.09 2025.05.27

Literature Database

Unmasking the Shadows: Pinpoint the Implementations of Anti-Dynamic Analysis Techniques in Malware Using LLM

Authors: Haizhou Wang, Nanqing Luo, Xusheng Li, Peng Liu | Published: 2024-11-08 | Updated: 2025-04-29

Malware Evolution

Attack Method

Analysis of Detection Methods

2024.11.08 2025.05.27

Literature Database

Free Record-Level Privacy Risk Evaluation Through Artifact-Based Methods

Authors: Joseph Pollock, Igor Shilov, Euodia Dodd, Yves-Alexandre de Montjoye | Published: 2024-11-08 | Updated: 2025-06-12

Performance Evaluation

Membership Inference

Differential Privacy

2024.11.08 2025.06.14

Literature Database