AI Security Portal bot

A Primer on Bayesian Neural Networks: Review and Debates

Authors: Julyan Arbel, Konstantinos Pitas, Mariia Vladimirova, Vincent Fortuin | Published: 2023-09-28
Algorithms
Sampling methods
Model selection

Breaking On-Chip Communication Anonymity using Flow Correlation Attacks

Authors: Hansika Weerasena, Prabhat Mishra | Published: 2023-09-27 | Updated: 2024-02-01
Performance evaluation
Flow correlation attacks
Defense methods

Watch Your Language: Investigating Content Moderation with Large Language Models

Authors: Deepak Kumar, Yousef AbuHashem, Zakir Durumeric | Published: 2023-09-25 | Updated: 2024-01-17
LLM performance evaluation
Prompt injection
Inappropriate content generation

Byzantine-Resilient Federated PCA and Low Rank Column-wise Sensing

Authors: Ankit Pratap Singh, Namrata Vaswani | Published: 2023-09-25 | Updated: 2024-08-09
Poisoning
Dimensionality reduction methods
Federated learning

LinGCN: Structural Linearized Graph Convolutional Network for Homomorphically Encrypted Inference

Authors: Hongwu Peng, Ran Ran, Yukui Luo, Jiahui Zhao, Shaoyi Huang, Kiran Thorat, Tong Geng, Chenghong Wang, Xiaolin Xu, Wujie Wen, Caiwen Ding | Published: 2023-09-25 | Updated: 2023-10-04
Watermarking
Performance evaluation
Deep learning methods

Can LLM-Generated Misinformation Be Detected?

Authors: Canyu Chen, Kai Shu | Published: 2023-09-25 | Updated: 2024-04-23
LLM performance evaluation
Prompt injection
Inappropriate content generation

Unbiased Watermark for Large Language Models

Authors: Zhengmian Hu, Lichang Chen, Xidong Wu, Yihan Wu, Hongyang Zhang, Heng Huang | Published: 2023-09-22 | Updated: 2023-10-18
Watermarking
Model performance evaluation
Statistical hypothesis testing

The Reversal Curse: LLMs trained on “A is B” fail to learn “B is A”

Authors: Lukas Berglund, Meg Tong, Max Kaufmann, Mikita Balesni, Asa Cooper Stickland, Tomasz Korbak, Owain Evans | Published: 2023-09-21 | Updated: 2024-05-26
Hallucination
Model evaluation
Training data bias

Privacy-Preserving In-Context Learning with Differentially Private Few-Shot Generation

Authors: Xinyu Tang, Richard Shin, Huseyin A. Inan, Andre Manoel, Fatemehsadat Mireshghallah, Zinan Lin, Sivakanth Gopi, Janardhan Kulkarni, Robert Sim | Published: 2023-09-21 | Updated: 2024-01-28
Data protection methods
Data generation
Privacy methods

How Robust is Google’s Bard to Adversarial Image Attacks?

Authors: Yinpeng Dong, Huanran Chen, Jiawei Chen, Zhengwei Fang, Xiao Yang, Yichi Zhang, Yu Tian, Hang Su, Jun Zhu | Published: 2023-09-21 | Updated: 2023-10-14
Adversarial training
Defense methods
Face recognition