Natural Language Processing

Trojan Activation Attack: Red-Teaming Large Language Models using Activation Steering for Safety-Alignment

Authors: Haoran Wang, Kai Shu | Published: 2023-11-15 | Updated: 2024-08-15
Prompt Injection
Attack Method
Natural Language Processing

Privately Aligning Language Models with Reinforcement Learning

Authors: Fan Wu, Huseyin A. Inan, Arturs Backurs, Varun Chandrasekaran, Janardhan Kulkarni, Robert Sim | Published: 2023-10-25 | Updated: 2024-05-03
Privacy Technique
Model Design
Natural Language Processing

Detecting Pretraining Data from Large Language Models

Authors: Weijia Shi, Anirudh Ajith, Mengzhou Xia, Yangsibo Huang, Daogao Liu, Terra Blevins, Danqi Chen, Luke Zettlemoyer | Published: 2023-10-25 | Updated: 2024-03-09
Data Generation
Natural Language Processing
Copyright Trap

Time Travel in LLMs: Tracing Data Contamination in Large Language Models

Authors: Shahriar Golchin, Mihai Surdeanu | Published: 2023-08-16 | Updated: 2024-02-21
Data Contamination Detection
Prompt Injection
Natural Language Processing

Revolutionizing Cyber Threat Detection with Large Language Models: A privacy-preserving BERT-based Lightweight Model for IoT/IIoT Devices

Authors: Mohamed Amine Ferrag, Mthandazo Ndhlovu, Norbert Tihanyi, Lucas C. Cordeiro, Merouane Debbah, Thierry Lestable, Narinderjit Singh Thandi | Published: 2023-06-25 | Updated: 2024-02-08
Malware Detection Method
Feature Extraction Method
Natural Language Processing

On the Uses of Large Language Models to Interpret Ambiguous Cyberattack Descriptions

Authors: Reza Fayyazi, Shanchieh Jay Yang | Published: 2023-06-24 | Updated: 2023-08-22
Prompt Injection
Malware Classification
Natural Language Processing

Automated Mapping of CVE Vulnerability Records to MITRE CWE Weaknesses

Authors: Ashraf Haddad, Najwa Aaraj, Preslav Nakov, Septimiu Fabian Mare | Published: 2023-04-13
Security Analysis
Dataset Generation
Natural Language Processing

Bayesian Attention Belief Networks

Authors: Shujian Zhang, Xinjie Fan, Bo Chen, Mingyuan Zhou | Published: 2021-06-09
Natural Language Processing
Computational Efficiency
Evaluation Method

Resilient and Adaptive Framework for Large Scale Android Malware Fingerprinting using Deep Learning and NLP Techniques

Authors: ElMouatez Billah Karbab, Mourad Debbabi | Published: 2021-05-27
Data-Driven Clustering
Malware Propagation Means
Natural Language Processing

Killing One Bird with Two Stones: Model Extraction and Attribute Inference Attacks against BERT-based APIs

Authors: Chen Chen, Xuanli He, Lingjuan Lyu, Fangzhao Wu | Published: 2021-05-23 | Updated: 2021-12-26
Privacy Protection Method
Membership Inference
Natural Language Processing