Literature Database

Large Language Lobotomy: Jailbreaking Mixture-of-Experts via Expert Silencing

Authors: Jona te Lintelo, Lichao Wu, Stjepan Picek | Published: 2026-02-09
Prompt Injection
Large Language Model
安全性分析

Sparse Models, Sparse Safety: Unsafe Routes in Mixture-of-Experts LLMs

Authors: Yukun Jiang, Hai Huang, Mingjie Li, Yage Zhang, Michael Backes, Yang Zhang | Published: 2026-02-09
Sparsity Defense
Prompt Injection
安全性分析

On Protecting Agentic Systems’ Intellectual Property via Watermarking

Authors: Liwen Wang, Zongjie Li, Yuchong Xie, Shuai Wang, Dongdong She, Wei Wang, Juergen Rahmel | Published: 2026-02-09
Watermarking
エージェントシステムの透かし技術
Digital Watermarking for Generative AI

Towards Real-World Industrial-Scale Verification: LLM-Driven Theorem Proving on seL4

Authors: Jianyu Zhang, Fuyuan Zhang, Jiayi Lu, Jilin Hu, Xiaoyi Yin, Long Zhang, Feng Yang, Yongwang Zhao | Published: 2026-02-09
LLM Performance Evaluation
Program Understanding
Transparency and Verification

InfiCoEvalChain: A Blockchain-Based Decentralized Framework for Collaborative LLM Evaluation

Authors: Yifan Yang, Jinjia Li, Kunxi Li, Puhao Zheng, Yuanyi Wang, Zheyan Qu, Yang Yu, Jianmin Wu, Ming Li, Hongxia Yang | Published: 2026-02-09
LLM Performance Evaluation
Incentive Mechanism
Model evaluation methods

Deep Learning for Contextualized NetFlow-Based Network Intrusion Detection: Methods, Data, Evaluation and Deployment

Authors: Abdelkader El Mahdaouy, Issam Ait Yahia, Soufiane Oualil, Ismail Berrada | Published: 2026-02-05
Graph Neural Network
ストリーミング状態管理
異常検知

Clouding the Mirror: Stealthy Prompt Injection Attacks Targeting LLM-based Phishing Detection

Authors: Takashi Koide, Hiroki Nakano, Daiki Chiba | Published: 2026-02-05
Indirect Prompt Injection
フィッシング検出手法
Prompt Injection

BadTemplate: A Training-Free Backdoor Attack via Chat Template Against Large Language Models

Authors: Zihan Wang, Hongwei Li, Rui Zhang, Wenbo Jiang, Guowen Xu | Published: 2026-02-05
LLM Performance Evaluation
データ毒性
Large Language Model

Spider-Sense: Intrinsic Risk Sensing for Efficient Agent Defense with Hierarchical Adaptive Screening

Authors: Zhenxiong Yu, Zhi Yang, Zhiheng Jin, Shuhe Wang, Heng Zhang, Yanlin Fei, Lingfeng Zeng, Fangqi Lou, Shuo Zhang, Tu Hu, Jingping Liu, Rongze Chen, Xingyu Zhu, Kunyi Wang, Chaofa Yuan, Xin Guo, Zhaowei Liu, Feipeng Zhang, Jie Huang, Huacan Wang, Ronghao Chen, Liwen Zhang | Published: 2026-02-05
攻撃手法の説明
Content Specialized for Toxicity Attacks

SynAT: Enhancing Security Knowledge Bases via Automatic Synthesizing Attack Tree from Crowd Discussions

Authors: Ziyou Jiang, Lin Shi, Guowei Yang, Xuyan Ma, Fenglong Li, Qing Wang | Published: 2026-02-05
LLM Performance Evaluation
Safety of Data Generation
攻撃ツリー合成