CyberThreat-Eval: Can Large Language Models Automate Real-World Threat Research?
Authors: Xiangsen Chen, Xuan Feng, Shuo Chen, Matthieu Maitre, Sudipto Rakshit, Diana Duvieilh, Ashley Picone, Nan Tang | Published: 2026-03-10
Disabling Safety Mechanisms of LLM
LLM Performance Evaluation
Indirect Prompt Injection