AI Security Portal bot

Just Fine-tune Twice: Selective Differential Privacy for Large Language Models

Authors: Weiyan Shi, Ryan Shea, Si Chen, Chiyuan Zhang, Ruoxi Jia, Zhou Yu | Published: 2022-04-15 | Updated: 2022-10-27
Privacy-preserving techniques
Applications of machine learning
Secret detector

Investigating Positive and Negative Qualities of Human-in-the-Loop Optimization for Designing Interaction Techniques

Authors: Liwei Chan, Yi-Chi Liao, George B. Mo, John J. Dudley, Chun-Lien Cheng, Per Ola Kristensson, Antti Oulasvirta | Published: 2022-04-15
Human-in-the-loop
Bayesian optimization
Optimization problems

Finding MNEMON: Reviving Memories of Node Embeddings

Authors: Yun Shen, Yufei Han, Zhikun Zhang, Min Chen, Ting Yu, Michael Backes, Yang Zhang, Gianluca Stringhini | Published: 2022-04-14 | Updated: 2022-04-29
Algorithm design
Dataset evaluation
Evaluation metrics

LSTM-Autoencoder based Anomaly Detection for Indoor Air Quality Time Series Data

Authors: Yuanyuan Wei, Julian Jang-Jaccard, Wen Xu, Fariza Sabrina, Seyit Camtepe, Mikael Boulic | Published: 2022-04-14
Algorithm design
Data extraction and analysis
Applications of machine learning

A Natural Language Processing Approach for Instruction Set Architecture Identification

Authors: Dinuka Sahabandu, Sukarno Mertoguno, Radha Poovendran | Published: 2022-04-13
Data extraction and analysis
Program comprehension
Applications of machine learning

Improving Differential-Neural Distinguisher Model For DES, Chaskey, and PRESENT

Authors: Liu Zhang, Zilong Wang | Published: 2022-04-13
Algorithm design
Experimental validation
Evaluation metrics

Overparameterized Linear Regression under Adversarial Attacks

Authors: Antônio H. Ribeiro, Thomas B. Schön | Published: 2022-04-13 | Updated: 2023-01-27
Adversarial examples
Applications of machine learning
Linear models

Stealing and Evading Malware Classifiers and Antivirus at Low False Positive Conditions

Authors: Maria Rigaki, Sebastian Garcia | Published: 2022-04-13 | Updated: 2023-06-04
Dataset evaluation
Model extraction attacks

Machine Learning Security against Data Poisoning: Are We There Yet?

Authors: Antonio Emanuele Cinà, Kathrin Grosse, Ambra Demontis, Battista Biggio, Fabio Roli, Marcello Pelillo | Published: 2022-04-12 | Updated: 2024-03-08
Poisoning
Attack types
Defense methods

Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

Authors: Yuntao Bai, Andy Jones, Kamal Ndousse, Amanda Askell, Anna Chen, Nova DasSarma, Dawn Drain, Stanislav Fort, Deep Ganguli, Tom Henighan, Nicholas Joseph, Saurav Kadavath, Jackson Kernion, Tom Conerly, Sheer El-Showk, Nelson Elhage, Zac Hatfield-Dodds, Danny Hernandez, Tristan Hume, Scott Johnston, Shauna Kravec, Liane Lovitt, Neel Nanda, Catherine Olsson, Dario Amodei, Tom Brown, Jack Clark, Sam McCandlish, Chris Olah, Ben Mann, Jared Kaplan | Published: 2022-04-12
Alignment
Reinforcement learning optimization
Performance evaluation