Literature Database

The Dark Side of Human Feedback: Poisoning Large Language Models via User Inputs

Authors: Bocheng Chen, Hanqing Guo, Guangjing Wang, Yuanda Wang, Qiben Yan | Published: 2024-09-01
LLM Performance Evaluation
Prompt Injection
Poisoning

Comprehensive Botnet Detection by Mitigating Adversarial Attacks, Navigating the Subtleties of Perturbation Distances and Fortifying Predictions with Conformal Layers

Authors: Rahul Yumlembam, Biju Issac, Seibu Mary Jacob, Longzhi Yang | Published: 2024-09-01
Poisoning
Adversarial Examples
Evaluation Methods

Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models

Authors: Bang An, Sicheng Zhu, Ruiyi Zhang, Michael-Andrei Panaitescu-Liess, Yuancheng Xu, Furong Huang | Published: 2024-09-01
LLM Performance Evaluation
Content Moderation
Prompt Injection

Enhancing Source Code Security with LLMs: Demystifying The Challenges and Generating Reliable Repairs

Authors: Nafis Tanveer Islam, Joseph Khoury, Andrew Seong, Elias Bou-Harb, Peyman Najafirad | Published: 2024-09-01
LLM Security
Vulnerability Management
Automated Vulnerability Repair

Is Difficulty Calibration All We Need? Towards More Practical Membership Inference Attacks

Authors: Yu He, Boheng Li, Yao Wang, Mengda Yang, Juan Wang, Hongxin Hu, Xingyu Zhao | Published: 2024-08-31 | Updated: 2024-09-04
Membership Inference
Attack Methods
Difficulty Calibration

Ethical Challenges in Computer Vision: Ensuring Privacy and Mitigating Bias in Publicly Available Datasets

Authors: Ghalib Ahmed Tahir | Published: 2024-08-31 | Updated: 2025-08-11
Data Collection
Ethical Guideline Compliance
Ensuring Fairness

AI-Driven Intrusion Detection Systems (IDS) on the ROAD Dataset: A Comparative Analysis for Automotive Controller Area Network (CAN)

Authors: Lorenzo Guerra, Linhan Xu, Paolo Bellavista, Thomas Chapuis, Guillaume Duc, Pavlo Mozharovskyi, Van-Tam Nguyen | Published: 2024-08-30 | Updated: 2024-09-05
Attack Methods
Automated Intrusion Detection Systems
Vehicle Network Security

Different Victims, Same Layout: Email Visual Similarity Detection for Enhanced Email Protection

Authors: Sachin Shukla, Omid Mirzaei | Published: 2024-08-29 | Updated: 2024-09-04
Watermarking
Spam Detection
Visual Similarity Detection

Analyzing Inference Privacy Risks Through Gradients in Machine Learning

Authors: Zhuohang Li, Andrew Lowy, Jing Liu, Toshiaki Koike-Akino, Kieran Parsons, Bradley Malin, Ye Wang | Published: 2024-08-29
Privacy Protection Methods
Poisoning
Membership Inference

LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet

Authors: Nathaniel Li, Ziwen Han, Ian Steneker, Willow Primack, Riley Goodside, Hugh Zhang, Zifan Wang, Cristina Menghini, Summer Yue | Published: 2024-08-27 | Updated: 2024-09-04
Prompt Injection
User Education
Attack Methods