AI Security Portal bot

Automated Trustworthiness Oracle Generation for Machine Learning Text Classifiers

Authors: Lam Nguyen Tung, Steven Cho, Xiaoning Du, Neelofar Neelofar, Valerio Terragni, Stefano Ruberto, Aldeida Aleti | Published: 2024-10-30 | Updated: 2025-04-23
XAI (Explainable AI)
Model Performance Evaluation
Reliability Analysis

CausAdv: A Causal-based Framework for Detecting Adversarial Examples

Authors: Hichem Debbi | Published: 2024-10-29
Framework
Adversarial Example

Privacy-Preserving Dynamic Assortment Selection

Authors: Young Hyun Cho, Will Wei Sun | Published: 2024-10-29
Privacy Protection
Privacy Protection Method
Optimization Problem

Resilience in Knowledge Graph Embeddings

Authors: Arnab Sharma, N'Dah Jean Kouagou, Axel-Cyrille Ngonga Ngomo | Published: 2024-10-28
Membership Inference
Defense Method

CTINexus: Automatic Cyber Threat Intelligence Knowledge Graph Construction Using Large Language Models

Authors: Yutong Cheng, Osama Bajaber, Saimon Amanuel Tsegai, Dawn Song, Peng Gao | Published: 2024-10-28 | Updated: 2025-04-21
Cyber Threat Intelligence
Prompt Leaking
Watermarking Technology

Integrating uncertainty quantification into randomized smoothing based robustness guarantees

Authors: Sina Däubener, Kira Maag, David Krueger, Asja Fischer | Published: 2024-10-27
Adversarial Example
Equivalence Evaluation

On the Geometry of Regularization in Adversarial Training: High-Dimensional Asymptotics and Generalization Bounds

Authors: Matteo Vilucchio, Nikolaos Tsilivis, Bruno Loureiro, Julia Kempe | Published: 2024-10-21
Convergence Analysis
Adversarial Training

Jailbreaking and Mitigation of Vulnerabilities in Large Language Models

Authors: Benji Peng, Keyu Chen, Qian Niu, Ziqian Bi, Ming Liu, Pohsun Feng, Tianyang Wang, Lawrence K. Q. Yan, Yizhu Wen, Yichao Zhang, Caitlyn Heqi Yin | Published: 2024-10-20 | Updated: 2025-05-08
LLM Security
Disabling Safety Mechanisms of LLM
Prompt Injection

A Novel Reinforcement Learning Model for Post-Incident Malware Investigations

Authors: Dipo Dunsin, Mohamed Chahine Ghanem, Karim Ouazzane, Vassil Vassilev | Published: 2024-10-19 | Updated: 2025-01-12
Cybersecurity
Malware Classification

Low-Rank Adversarial PGD Attack

Authors: Dayana Savostianova, Emanuele Zangrando, Francesco Tudisco | Published: 2024-10-16
Attack Method