Model Robustness

DREAM: Dynamic Red-teaming across Environments for AI Models

Authors: Liming Lu, Xiang Gu, Junyu Huang, Jiawei Du, Yunhuai Liu, Yongbin Zhou, Shuchao Pang | Published: 2025-12-22

Model Robustness

Dynamic Attack Evaluation Method

Vulnerability Attack Method

2025.12.22 2025.12.24

Literature Database

Beyond Text: Multimodal Jailbreaking of Vision-Language and Audio Models through Perceptually Simple Transformations

Authors: Divyanshu Kumar, Shreyas Jena, Nitin Aravind Birur, Tanay Baswa, Sahil Agarwal, Prashanth Harshangi | Published: 2025-10-23

Model Robustness

Large Language Model

Attack Method Evaluation

2025.10.23 2025.10.25

Literature Database

SAID: Empowering Large Language Models with Self-Activating Internal Defense

Authors: Yulong Chen, Yadong Liu, Jiawen Zhang, Mu Li, Chao Huang, Jie Wen | Published: 2025-10-23

Prompt Injection

Model Robustness

Large Language Model

2025.10.23 2025.10.25

Literature Database

The Tail Tells All: Estimating Model-Level Membership Inference Vulnerability Without Reference Models

Authors: Euodia Dodd, Nataša Krčo, Igor Shilov, Yves-Alexandre de Montjoye | Published: 2025-10-22

Privacy-Preserving Machine Learning

Model Robustness

Low-Cost Membership Inference Method

2025.10.22 2025.10.24

Literature Database

Exploring the Effect of DNN Depth on Adversarial Attacks in Network Intrusion Detection Systems

Authors: Mohamed ElShehaby, Ashraf Matrawy | Published: 2025-10-22

Network Threat Detection

Model Robustness

Certified Robustness

2025.10.22 2025.10.24

Literature Database

Can You Trust What You See? Alpha Channel No-Box Attacks on Video Object Detection

Authors: Ariana Yi, Ce Zhou, Liyang Xiao, Qiben Yan | Published: 2025-10-22

Platform Architecture

Model Robustness

Research Methodology

2025.10.22 2025.10.24

Literature Database

SentinelNet: Safeguarding Multi-Agent Collaboration Through Credit-Based Dynamic Threat Detection

Authors: Yang Feng, Xudong Pan | Published: 2025-10-17 | Updated: 2025-10-21

Agent Design

Network Threat Detection

Model Robustness

2025.10.17 2025.10.23

Literature Database

TrafficLLM: Enhancing Large Language Models for Network Traffic Analysis with Generic Traffic Representation

Authors: Tianyu Cui, Xinjie Lin, Sijia Li, Miao Chen, Qilei Yin, Qi Li, Ke Xu | Published: 2025-04-05 | Updated: 2025-04-15

LLM Performance Evaluation

Task-Specific Tuning

Model Robustness

2025.04.05 2025.05.27

Literature Database

Robust LLM safeguarding via refusal feature adversarial training

Authors: Lei Yu, Virginie Do, Karen Hambardzumyan, Nicola Cancedda | Published: 2024-09-30 | Updated: 2025-03-20

Prompt Injection

Model Robustness

Adversarial Learning

2024.09.30 2025.05.27

Literature Database

Stealing Part of a Production Language Model

Authors: Nicholas Carlini, Daniel Paleka, Krishnamurthy Dj Dvijotham, Thomas Steinke, Jonathan Hayase, A. Feder Cooper, Katherine Lee, Matthew Jagielski, Milad Nasr, Arthur Conmy, Itay Yona, Eric Wallace, David Rolnick, Florian Tramèr | Published: 2024-03-11 | Updated: 2024-07-09

Prompt Leaking

Model Robustness

Model Extraction Attack

2024.03.11 2025.05.27

Literature Database