Performance Evaluation

Constitutional AI: Harmlessness from AI Feedback

Authors: Yuntao Bai, Saurav Kadavath, Sandipan Kundu, Amanda Askell, Jackson Kernion, Andy Jones, Anna Chen, Anna Goldie, Azalia Mirhoseini, Cameron McKinnon, Carol Chen, Catherine Olsson, Christopher Olah, Danny Hernandez, Dawn Drain, Deep Ganguli, Dustin Li, Eli Tran-Johnson, Ethan Perez, Jamie Kerr, Jared Mueller, Jeffrey Ladish, Joshua Landau, Kamal Ndousse, Kamile Lukosuite, Liane Lovitt, Michael Sellitto, Nelson Elhage, Nicholas Schiefer, Noemi Mercado, Nova DasSarma, Robert Lasenby, Robin Larson, Sam Ringer, Scott Johnston, Shauna Kravec, Sheer El Showk, Stanislav Fort, Tamera Lanham, Timothy Telleen-Lawton, Tom Conerly, Tom Henighan, Tristan Hume, Samuel R. Bowman, Zac Hatfield-Dodds, Ben Mann, Dario Amodei, Nicholas Joseph, Sam McCandlish, Tom Brown, Jared Kaplan | Published: 2022-12-15
Alignment
Prompt Injection
Performance Evaluation

An Empirical Analysis of SMS Scam Detection Systems

Authors: Muhammad Salman, Muhammad Ikram, Mohamed Ali Kaafar | Published: 2022-10-19
Membership Inference
Performance Evaluation
Adversarial Attack Methods

Differentially Private Diffusion Models

Authors: Tim Dockhorn, Tianshi Cao, Arash Vahdat, Karsten Kreis | Published: 2022-10-18 | Updated: 2023-12-31
Privacy Evaluation
Performance Evaluation
Generative Adversarial Networks

MaSS: Multi-attribute Selective Suppression

Authors: Chun-Fu Chen, Shaohan Hu, Zhonghao Shi, Prateek Gulati, Bill Moriarty, Marco Pistoia, Vincenzo Piuri, Pierangela Samarati | Published: 2022-10-18 | Updated: 2022-10-24
Data Privacy Evaluation
Poisoning
Performance Evaluation

Marksman Backdoor: Backdoor Attacks with Arbitrary Target Class

Authors: Khoa D. Doan, Yingjie Lao, Ping Li | Published: 2022-10-17
Backdoor Attacks
Performance Evaluation

Federated Learning with Privacy-Preserving Ensemble Attention Distillation

Authors: Xuan Gong, Liangchen Song, Rishi Vedula, Abhishek Sharma, Meng Zheng, Benjamin Planche, Arun Innanje, Terrence Chen, Junsong Yuan, David Doermann, Ziyan Wu | Published: 2022-10-16
Privacy Risk Management
Poisoning
Performance Evaluation

DI-NIDS: Domain Invariant Network Intrusion Detection System

Authors: Siamak Layeghy, Mahsa Baktashmotlagh, Marius Portmann | Published: 2022-10-15
Performance Evaluation
Machine Learning Techniques
Deep Learning Methods

DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Generation Models

Authors: Zeyang Sha, Zheng Li, Ning Yu, Yang Zhang | Published: 2022-10-13 | Updated: 2023-01-09
Dataset Generation
Performance Evaluation
Generative Adversarial Networks

Cross Project Software Vulnerability Detection via Domain Adaptation and Max-Margin Principle

Authors: Van Nguyen, Trung Le, Chakkrit Tantithamthavorn, John Grundy, Hung Nguyen, Dinh Phung | Published: 2022-09-19
Model Performance Evaluation
Training Improvement
Performance Evaluation

Prior Knowledge based Advanced Persistent Threats Detection for IoT in a Realistic Benchmark

Authors: Yu Shen, Murat Simsek, Burak Kantarci, Hussein T. Mouftah, Mehran Bagheri, Petar Djukic | Published: 2022-08-10
IoT Security Risks
Performance Evaluation
Machine Learning Methods