Evaluation Method

Rotated Robustness: A Training-Free Defense against Bit-Flip Attacks on Large Language Models

Authors: Deng Liu, Song Chen | Published: 2026-03-17
Adversarial Learning
Vulnerability Management
Evaluation Method

Exponential-Family Membership Inference: From LiRA and RMIA to BaVarIA

Authors: Rickard Brännvall | Published: 2026-03-12
攻撃計画手法
Machine Learning Algorithm
Evaluation Method

TOSSS: a CVE-based Software Security Benchmark for Large Language Models

Authors: Marc Damie, Murat Bilgehan Ertan, Domenico Essoussi, Angela Makhanu, Gaëtan Peter, Roos Wensveen | Published: 2026-03-11
LLM Performance Evaluation
Prompt Injection
Evaluation Method

Detecting and Eliminating Neural Network Backdoors Through Active Paths with Application to Intrusion Detection

Authors: Eirik Høyheim, Magnus Wiik Eckhoff, Gudmund Grov, Robert Flood, David Aspinall | Published: 2026-03-11
データ毒性
Backdoor Attack
Evaluation Method

Enhancing Network Intrusion Detection Systems: A Multi-Layer Ensemble Approach to Mitigate Adversarial Attacks

Authors: Nasim Soltani, Shayan Nejadshamsi, Zakaria Abou El Houda, Raphael Khoury, Kelton A. P. Costa, Tiago H. Falk, Anderson R. Avila | Published: 2026-03-11
Certified Robustness
Machine Learning Algorithm
Evaluation Method

DeepSight: An All-in-One LM Safety Toolkit

Authors: Bo Zhang, Jiaxuan Guo, Lijun Li, Dongrui Liu, Sujin Chen, Guanxu Chen, Zhijie Zheng, Qihao Lin, Lewen Yan, Chen Qian, Yijin Zhou, Yuyao Wu, Shaoxiong Guo, Tianyi Du, Jingyi Yang, Xuhao Hu, Ziqi Miao, Xiaoya Lu, Jing Shao, Xia Hu | Published: 2026-02-12
Prompt Injection
Large Language Model
Evaluation Method

Jailbreaking Leaves a Trace: Understanding and Detecting Jailbreak Attacks from Internal Representations of Large Language Models

Authors: Sri Durga Sai Sowmya Kadali, Evangelos E. Papalexakis | Published: 2026-02-12
Prompt Injection
Experimental Validation
Evaluation Method

TriDF: Evaluating Perception, Detection, and Hallucination for Interpretable DeepFake Detection

Authors: Jian-Yu Jiang-Lin, Kang-Yang Huang, Ling Zou, Ling Lo, Sheng-Ping Yang, Yu-Wen Tseng, Kun-Hsiang Lin, Chia-Ling Chen, Yu-Ting Ta, Yan-Tsung Wang, Po-Ching Chen, Hongxia Xie, Hong-Han Shuai, Wen-Huang Cheng | Published: 2025-12-11
Detection of Hallucinations
Model DoS
Evaluation Method

LLM-Assisted AHP for Explainable Cyber Range Evaluation

Authors: Vyron Kampourakis, Georgios Kavallieratos, Georgios Spathoulas, Vasileios Gkioulos, Sokratis Katsikas | Published: 2025-12-11
XAI (Explainable AI)
Reliability Assessment
Evaluation Method

From Lab to Reality: A Practical Evaluation of Deep Learning Models and LLMs for Vulnerability Detection

Authors: Chaomeng Lu, Bert Lagaisse | Published: 2025-12-11
Certified Robustness
Calculation of Output Harmfulness
Evaluation Method