Large Language Model

EditMF: Drawing an Invisible Fingerprint for Your Large Language Models

Authors: Jiaxuan Wu, Yinghan Zhou, Wanli Peng, Yiming Xue, Juan Wen, Ping Zhong | Published: 2025-08-12
Large Language Model
Author Attribution Method
Watermark Design

Repairing vulnerabilities without invisible hands. A differentiated replication study on LLMs

Authors: Maria Camporese, Fabio Massacci | Published: 2025-07-28
Prompt Injection
Large Language Model
Vulnerability Management

ARMOR: Aligning Secure and Safe Large Language Models via Meticulous Reasoning

Authors: Zhengyue Zhao, Yingzi Ma, Somesh Jha, Marco Pavone, Patrick McDaniel, Chaowei Xiao | Published: 2025-07-14 | Updated: 2025-10-20
Large Language Model
Safety Analysis
Evaluation Criteria

GuardVal: Dynamic Large Language Model Jailbreak Evaluation for Comprehensive Safety Testing

Authors: Peiyan Zhang, Haibo Jin, Liying Kang, Haohan Wang | Published: 2025-07-10
Prompt Validation
Large Language Model
Performance Evaluation Metrics

Hybrid LLM-Enhanced Intrusion Detection for Zero-Day Threats in IoT Networks

Authors: Mohammad F. Al-Hammouri, Yazan Otoum, Rasha Atwa, Amiya Nayak | Published: 2025-07-10
Hybrid Algorithm
Prompt Injection
Large Language Model

The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation

Authors: Alexander Xiong, Xuandong Zhao, Aneesh Pappu, Dawn Song | Published: 2025-07-08
Prompt Leaking
Memorization Mechanism
Large Language Model

The Hidden Threat in Plain Text: Attacking RAG Data Loaders

Authors: Alberto Castagnaro, Umberto Salviati, Mauro Conti, Luca Pajola, Simeone Pizzi | Published: 2025-07-07
Poisoning Attack on RAG
Large Language Model
Adversarial Attack

Are AI-Generated Fixes Secure? Analyzing LLM and Agent Patches on SWE-bench

Authors: Amirali Sajadi, Kostadin Damevski, Preetha Chatterjee | Published: 2025-06-30 | Updated: 2025-07-24
Software Security
Prompt Injection
Large Language Model

SoK: Semantic Privacy in Large Language Models

Authors: Baihe Ma, Yanna Jiang, Xu Wang, Guangshen Yu, Qin Wang, Caijun Sun, Chen Li, Xuelei Qi, Ying He, Wei Ni, Ren Ping Liu | Published: 2025-06-30
Semantic Information Extraction
Privacy Protection
Large Language Model

MetaCipher: A Time-Persistent and Universal Multi-Agent Framework for Cipher-Based Jailbreak Attacks for LLMs

Authors: Boyuan Chen, Minghao Shao, Abdul Basit, Siddharth Garg, Muhammad Shafique | Published: 2025-06-27 | Updated: 2025-08-13
Framework
Large Language Model
Jailbreak Attack Method