AI Security Portal bot

Making Them Ask and Answer: Jailbreaking Large Language Models in Few Queries via Disguise and Reconstruction

Authors: Tong Liu, Yingjie Zhang, Zhe Zhao, Yinpeng Dong, Guozhu Meng, Kai Chen | Published: 2024-02-28 | Updated: 2024-06-10
LLM Security
LLM Performance Evaluation
Prompt Injection

ChatSpamDetector: Leveraging Large Language Models for Effective Phishing Email Detection

Authors: Takashi Koide, Naoki Fukushi, Hiroki Nakano, Daiki Chiba | Published: 2024-02-28 | Updated: 2024-08-23
Phishing Detection
Prompt Injection
Email Security

Token-Specific Watermarking with Enhanced Detectability and Semantic Coherence for Large Language Models

Authors: Mingjia Huo, Sai Ashish Somayajula, Youwei Liang, Ruisi Zhang, Farinaz Koushanfar, Pengtao Xie | Published: 2024-02-28 | Updated: 2024-06-06
Watermarking
Prompt Injection
Multi-Objective Optimization

Multistatic-Radar RCS-Signature Recognition of Aerial Vehicles: A Bayesian Fusion Approach

Authors: Michael Potter, Murat Akcakaya, Marius Necsoiu, Gunar Schirner, Deniz Erdogmus, Tales Imbiriba | Published: 2024-02-28 | Updated: 2024-08-16
Training Data Generation
Bayesian Classification
Machine Learning Method

Robustness-Congruent Adversarial Training for Secure Machine Learning Model Updates

Authors: Daniele Angioni, Luca Demetrio, Maura Pintor, Luca Oneto, Davide Anguita, Battista Biggio, Fabio Roli | Published: 2024-02-27 | Updated: 2025-05-29
Model Design
Robustness Evaluation
Adversarial Learning

An Investigation into the Performances of the State-of-the-art Machine Learning Approaches for Various Cyber-attack Detection: A Survey

Authors: Tosin Ige, Christopher Kiekintveld, Aritran Piplai | Published: 2024-02-26 | Updated: 2024-05-10
SQL Injection Attack Detection
Phishing Detection
Machine Learning Method

Improving behavior based authentication against adversarial attack using XAI

Authors: Dong Qin, George Amariucai, Daji Qiao, Yong Guan | Published: 2024-02-26 | Updated: 2024-03-10
Adversarial Training
Feature Selection Method
Defense Method

LLMs Can Defend Themselves Against Jailbreaking in a Practical Manner: A Vision Paper

Authors: Daoyuan Wu, Shuai Wang, Yang Liu, Ning Liu | Published: 2024-02-24 | Updated: 2024-03-04
LLM Security
Prompt Injection
Prompt Engineering

On Trojan Signatures in Large Language Models of Code

Authors: Aftab Hussain, Md Rafiqul Islam Rabin, Mohammad Amin Alipour | Published: 2024-02-23 | Updated: 2024-03-07
LLM Security
Trojan Horse Signature
Trojan Detection

Verifiable Boosted Tree Ensembles

Authors: Stefano Calzavara, Lorenzo Cazzaro, Claudio Lucchese, Giulio Ermanno Pibiri | Published: 2024-02-22
Model Performance Evaluation
Robustness Evaluation
Optimization Problem