文献データベース

JavelinGuard: Low-Cost Transformer Architectures for LLM Security

Authors: Yash Datta, Sharath Rajasekar | Published: 2025-06-09

プライバシー保護技術

プロンプトインジェクション

モデルアーキテクチャ

2025.06.09

文献データベース

Auditing Black-Box LLM APIs with a Rank-Based Uniformity Test

Authors: Xiaoyuan Zhu, Yaowen Ye, Tianyi Qiu, Hanlin Zhu, Sijun Tan, Ajraf Mannan, Jonathan Michala, Raluca Ada Popa, Willie Neiswanger | Published: 2025-06-08 | Updated: 2025-06-11

APIセキュリティ

評価手法

選択手法

2025.06.08

文献データベース

Chain-of-Code Collapse: Reasoning Failures in LLMs via Adversarial Prompting in Code Generation

Authors: Jaechul Roh, Varun Gandhi, Shivani Anilkumar, Arin Garg | Published: 2025-06-08 | Updated: 2025-06-12

パフォーマンス評価

プロンプトインジェクション

プロンプトリーキング

2025.06.08

文献データベース

The Scales of Justitia: A Comprehensive Survey on Safety Evaluation of LLMs

Authors: Songyang Liu, Chaozhuo Li, Jiameng Qiu, Xi Zhang, Feiran Huang, Litian Zhang, Yiming Hei, Philip S. Yu | Published: 2025-06-06 | Updated: 2025-10-30

アライメント

大規模言語モデル

安全性評価

2025.06.06

文献データベース

Watermarking Degrades Alignment in Language Models: Analysis and Mitigation

Authors: Apurv Verma, NhatHai Phan, Shubhendu Trivedi | Published: 2025-06-04 | Updated: 2025-07-10

性能評価指標

生成AI向け電子透かし

透かし

2025.06.04

文献データベース

TracLLM: A Generic Framework for Attributing Long Context LLMs

Authors: Yanting Wang, Wei Zou, Runpeng Geng, Jinyuan Jia | Published: 2025-06-04

LLMとの協力効果

RAGへのポイズニング攻撃

効率評価

2025.06.04

文献データベース

Privacy and Security Threat for OpenAI GPTs

Authors: Wei Wenying, Zhao Kaifa, Xue Lei, Fan Ming | Published: 2025-06-04

LLMの安全機構の解除

プライバシー問題

防御メカニズム

2025.06.04

文献データベース

Evaluating Apple Intelligence’s Writing Tools for Privacy Against Large Language Model-Based Inference Attacks: Insights from Early Datasets

Authors: Mohd. Farhan Israk Soumik, Syed Mhamudul Hasan, Abdur R. Shahid | Published: 2025-06-04

テキスト分類の応用

プライバシー問題

プロンプトインジェクション

2025.06.04

文献データベース

Client-Side Zero-Shot LLM Inference for Comprehensive In-Browser URL Analysis

Authors: Avihay Cohen | Published: 2025-06-04

アライメント

プロンプトインジェクション

動的分析

2025.06.04

文献データベース

A Threat Intelligence Event Extraction Conceptual Model for Cyber Threat Intelligence Feeds

Authors: Jamal H. Al-Yasiri, Mohamad Fadli Bin Zolkipli, Nik Fatinah N Mohd Farid, Mohammed Alsamman, Zainab Ali Mohammed | Published: 2025-06-04

サイバー脅威

効率評価

情報抽出手法

2025.06.04

文献データベース