AI Security Portal K Program
Inferring Discussion Topics about Exploitation of Vulnerabilities from Underground Hacking Forums
Abstract
The increasing sophistication of cyber threats calls for proactive measures to identify vulnerabilities and potential exploits. Underground hacking forums are venues where hacking techniques are exchanged and exploitation is discussed. In this research, we propose an approach that uses topic modeling to uncover the key themes in vulnerability discussions within these forums. Our objective is to develop a machine learning-based model that automatically detects and classifies vulnerability-related discussions in underground hacking forums. By monitoring and analyzing forum content, we aim to identify emerging vulnerabilities, exploit techniques, and potential threat actors. To this end, we collect a large-scale dataset of posts and threads from multiple underground forums, then preprocess and clean the data to ensure accuracy and reliability. Using topic modeling, specifically Latent Dirichlet Allocation (LDA), we uncover latent topics and their associated keywords in the dataset. This lets us identify recurring themes and prevalent discussions about vulnerabilities, exploits, and potential targets.
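To make the LDA step concrete, the following is a minimal sketch of topic inference by collapsed Gibbs sampling, written with the Python standard library only. It is not the authors' pipeline: the abstract does not describe their implementation, and the toy posts, the stopword list, the function names (`tokenize`, `fit_lda`, `top_words`, `doc_topic_dist`), and the hyperparameters (`num_topics=2`, `alpha=0.1`, `beta=0.01`, `iters=200`) are all assumptions chosen for illustration.

```python
import random
import re

# Toy stand-ins for forum posts (invented for illustration, not real data).
POSTS = [
    "new exploit for the router firmware buffer overflow",
    "buffer overflow exploit payload for firmware",
    "selling stolen credit card dumps and bank accounts",
    "bank accounts and credit card dumps for sale",
    "sql injection exploit against web login form",
    "web login sql injection payload list",
]
# Hypothetical stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"for", "the", "and", "against", "new"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    Returns (vocab, topic-word counts, document-topic counts).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)
    n_dk = [[0] * num_topics for _ in docs]              # doc-topic counts
    n_kw = [[0] * vocab_size for _ in range(num_topics)]  # topic-word counts
    n_k = [0] * num_topics                                # tokens per topic
    z = []                                                # topic of each token
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.randrange(num_topics)
            z[d].append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][wid] -= 1
                n_k[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][wid] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][wid] += 1
                n_k[k] += 1
    return vocab, n_kw, n_dk


def top_words(vocab, n_kw, n=5):
    """Return the n highest-count words for each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in n_kw
    ]


def doc_topic_dist(n_dk, alpha=0.1):
    """Smoothed per-document topic proportions."""
    dists = []
    for row in n_dk:
        total = sum(row) + alpha * len(row)
        dists.append([(c + alpha) / total for c in row])
    return dists


docs = [tokenize(p) for p in POSTS]
vocab, n_kw, n_dk = fit_lda(docs)
for k, words in enumerate(top_words(vocab, n_kw)):
    print(f"topic {k}: {', '.join(words)}")
```

The top words per topic play the role of the "associated keywords" mentioned above, and `doc_topic_dist(n_dk)` gives each post's topic mixture, which could be used to group posts by theme. At realistic scale, an established library implementation would likely be preferable to this hand-rolled sampler.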