Towards a standardized methodology and dataset for evaluating LLM-based digital forensic timeline analysis Authors: Hudan Studiawan, Frank Breitinger, Mark Scanlon | Published: 2025-05-06 LLM Performance Evaluation, Large Language Model, Evaluation Method 2025.05.06 2025.05.27 Literature Database
GuidedBench: Measuring and Mitigating the Evaluation Discrepancies of In-the-wild LLM Jailbreak Methods Authors: Ruixuan Huang, Xunguang Wang, Zongjie Li, Daoyuan Wu, Shuai Wang | Published: 2025-02-24 | Updated: 2025-07-09 Prompt Injection, Jailbreak Method, Evaluation Method 2025.02.24 2025.07.11 Literature Database
Evaluating and Improving the Robustness of Security Attack Detectors Generated by LLMs Authors: Samuele Pasini, Jinhan Kim, Tommaso Aiello, Rocio Cabrera Lozoya, Antonino Sabetta, Paolo Tonella | Published: 2024-11-27 | Updated: 2025-09-17 RAG, Poisoning attack on RAG, Evaluation Method 2024.11.27 2025.09.19 Literature Database
Variational Bayesian Bow tie Neural Networks with Shrinkage Authors: Alisa Sheinkman, Sara Wade | Published: 2024-11-17 | Updated: 2024-11-19 Sparse Model, Optimization Problem, Evaluation Method 2024.11.17 2025.05.27 Literature Database
FEDLAD: Federated Evaluation of Deep Leakage Attacks and Defenses Authors: Isaac Baglin, Xiatian Zhu, Simon Hadfield | Published: 2024-11-05 | Updated: 2025-01-05 Poisoning, Attack Evaluation, Evaluation Method 2024.11.05 2025.05.27 Literature Database
Can LLMs be Scammed? A Baseline Measurement Study Authors: Udari Madhushani Sehwag, Kelly Patel, Francesca Mosca, Vineeth Ravi, Jessica Staddon | Published: 2024-10-14 LLM Performance Evaluation, Prompt Injection, Evaluation Method 2024.10.14 2025.05.27 Literature Database
FedCert: Federated Accuracy Certification Authors: Minh Hieu Nguyen, Huu Tien Nguyen, Trung Thanh Nguyen, Manh Duong Nguyen, Trong Nghia Hoang, Truong Thao Nguyen, Phi Le Nguyen | Published: 2024-10-04 Evaluation Method 2024.10.04 2025.05.27 Literature Database
A novel application of Shapley values for large multidimensional time-series data: Applying explainable AI to a DNA profile classification neural network Authors: Lauren Elborough, Duncan Taylor, Melissa Humphries | Published: 2024-09-26 Algorithm, Watermarking, Evaluation Method 2024.09.26 2025.05.27 Literature Database
LLM-Enhanced Software Patch Localization Authors: Jinhong Yu, Yi Chen, Di Tang, Xiaozhong Liu, XiaoFeng Wang, Chen Wu, Haixu Tang | Published: 2024-09-10 | Updated: 2024-09-13 LLM Performance Evaluation, Understanding Commit Content, Evaluation Method 2024.09.10 2025.05.27 Literature Database
VoiceWukong: Benchmarking Deepfake Voice Detection Authors: Ziwei Yan, Yanjie Zhao, Haoyu Wang | Published: 2024-09-10 Deep Fake Audio Evaluation, Evaluation Method, Speech Synthesis Technology 2024.09.10 2025.05.27 Literature Database