CTI-HAL: A Human-Annotated Dataset for Cyber Threat Intelligence Analysis

arXiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2504.05866

PDF

https://arxiv.org/pdf/2504.05866

Paper Information

Authors: Sofia Della Penna, Roberto Natella, Vittorio Orbinato, Lorenzo Parracino, Luciano Pianese
Published: 2025-04-08
Affiliation: DIETI, Università degli Studi di Napoli Federico II
Country of affiliation: Italy
Conference: European Symposium on Security and Privacy (EuroS&P)

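AI-Estimated Labels

Model performance evaluation, Large language models, LLM applications

* These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the Literature Database".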

Abstract

Organizations are increasingly targeted by Advanced Persistent Threats (APTs), which involve complex, multi-stage tactics and diverse techniques. Cyber Threat Intelligence (CTI) sources, such as incident reports and security blogs, provide valuable insights, but are often unstructured and in natural language, making it difficult to automatically extract information. Recent studies have explored the use of AI to perform automatic extraction from CTI data, leveraging existing CTI datasets for performance evaluation and fine-tuning. However, these datasets present challenges and limitations that impact their effectiveness. To overcome these issues, we introduce a novel dataset manually constructed from CTI reports and structured according to the MITRE ATT&CK framework. To assess its quality, we conducted an inter-annotator agreement study using Krippendorff's alpha, confirming its reliability. Furthermore, the dataset was used to evaluate a Large Language Model (LLM) in a real-world business context, showing promising generalizability.

External Datasets

CTI-HAL