Security Assurance

Jailbroken: How Does LLM Safety Training Fail?

Authors: Alexander Wei, Nika Haghtalab, Jacob Steinhardt | Published: 2023-07-05
Security Assurance
Prompt Injection
Adversarial Attack Methods

Vulnerable Source Code Detection using SonarCloud Code Analysis

Authors: Alifia Puspaningrum, Muhammad Anis Al Hilmi, Darsih, Muhamad Mustamiin, Maulana Ilham Ginanjar | Published: 2023-07-05
Code Change Analysis
System Observability
Security Assurance

Overconfidence is a Dangerous Thing: Mitigating Membership Inference Attacks by Enforcing Less Confident Prediction

Authors: Zitao Chen, Karthik Pattabiraman | Published: 2023-07-04
Security Assurance
Data Leakage
Membership Inference

New intelligent defense systems to reduce the risks of Selfish Mining and Double-Spending attacks using Learning Automata

Authors: Seyed Ardalan Ghoreishi, Mohammad Reza Meybodi | Published: 2023-07-02 | Updated: 2024-03-08
Algorithm Design
Security Assurance
Reinforcement Learning Environment

Gradients Look Alike: Sensitivity is Often Overestimated in DP-SGD

Authors: Anvith Thudi, Hengrui Jia, Casey Meehan, Ilia Shumailov, Nicolas Papernot | Published: 2023-07-01 | Updated: 2024-07-16
Security Assurance
Data Concealment
Privacy Analysis

Large Language Models for Code: Security Hardening and Adversarial Testing

Authors: Jingxuan He, Martin Vechev | Published: 2023-02-10 | Updated: 2024-08-16
Security Assurance
Prompt Injection
Vulnerability Analysis

RADAR: A TTP-based Extensible, Explainable, and Effective System for Network Traffic Analysis and Malware Detection

Authors: Yashovardhan Sharma, Simon Birnbach, Ivan Martinovic | Published: 2022-12-07 | Updated: 2023-04-13
Security Assurance
Software Security
Evaluation Methods

Targets in Reinforcement Learning to solve Stackelberg Security Games

Authors: Saptarashmi Bandyopadhyay, Chenqi Zhu, Philip Daniel, Joshua Morrison, Ethan Shay, John Dickerson | Published: 2022-11-30
Algorithm Design
Stacking Model
Security Assurance

BLADERUNNER: Rapid Countermeasure for Synthetic (AI-Generated) StyleGAN Faces

Authors: Adam Dorian Wong | Published: 2022-10-12 | Updated: 2022-10-28
DNN IP Protection Methods
Security Assurance
Challenges of Generative Models

A Certifiable Security Patch for Object Tracking in Self-Driving Systems via Historical Deviation Modeling

Authors: Xudong Pan, Qifan Xiao, Mi Zhang, Min Yang | Published: 2022-07-18
Algorithm Design
Security Assurance
State Estimation Methods