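This page lists the security targets, the attacks and causes that bring it about, and the defense methods and countermeasures for the negative impact "Creation of disinformation using AI" in the external influence aspect, as mapped on the AI Security Map.
Security targets
- Non-consumers
- Society
Attacks and causes
- Misuse of availability
- Misuse of accuracy
- Compromise of controllability
- Deepfakes
- Social engineering attacks
Defense methods and countermeasures
References
Deepfakes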
- Face2Face: Real-time Face Capture and Reenactment of RGB Videos, 2016
- Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, 2017
- AttGAN: Facial Attribute Editing by Only Changing What You Want, 2017
- FSGAN: Subject Agnostic Face Swapping and Reenactment, 2019
- STGAN: A Unified Selective Transfer Network for Arbitrary Image Attribute Editing, 2019
- A Style-Based Generator Architecture for Generative Adversarial Networks, 2019
- Few-Shot Adversarial Learning of Realistic Neural Talking Head Models, 2019
Social engineering attacks
Alignment
- Training language models to follow instructions with human feedback, 2022
- Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback, 2022
- Constitutional AI: Harmlessness from AI Feedback, 2022
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model, 2023
- A General Theoretical Paradigm to Understand Learning from Human Preferences, 2023
- RRHF: Rank Responses to Align Language Models with Human Feedback without tears, 2023
- Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations, 2023
- Self-Rewarding Language Models, 2024
- KTO: Model Alignment as Prospect Theoretic Optimization, 2024
- SimPO: Simple Preference Optimization with a Reference-Free Reward, 2024
Digital watermarking for generative AI
Encryption techniques
- Gazelle: A Low Latency Framework for Secure Neural Network Inference, 2018
- Faster CryptoNets: Leveraging Sparsity for Real-World Encrypted Inference, 2018
- nGraph-HE2: A High-Throughput Framework for Neural Network Inference on Encrypted Data, 2019
- Privacy-Preserving Machine Learning with Fully Homomorphic Encryption for Deep Neural Network, 2021
Identification of AI-generated output
- Defending Against Neural Fake News, 2019
- Real or Fake? Learning to Discriminate Machine from Human Generated Text, 2019
- Automatic Detection of Generated Text is Easiest when Humans are Fooled, 2020
- DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature, 2023
- Inspection and Control of Self-Generated-Text Recognition Ability in Llama3-8b-Instruct, 2025
Disinformation detection
- Fake News Detection on Social Media: A Data Mining Perspective, 2017
- CSI: A Hybrid Deep Model for Fake News Detection, 2017
- Towards Few-Shot Fact-Checking via Perplexity, 2021
- Fact-Checking Complex Claims with Program-Guided Reasoning, 2023
- Towards LLM-based Fact Verification on News Claims with a Hierarchical Step-by-Step Prompting Method, 2023
Deepfake detection
- Two-Stream Neural Networks for Tampered Face Detection, 2017
- Exposing DeepFake Videos By Detecting Face Warping Artifacts, 2019
- Exposing Deep Fakes Using Inconsistent Head Poses, 2019
- CNN-generated images are surprisingly easy to spot… for now, 2020
- Face X-ray for More General Face Forgery Detection, 2020
- FakeCatcher: Detection of Synthetic Portrait Videos using Biological Signals, 2020
- End-to-end anti-spoofing with RawNet2, 2021