Unveiling the Digital Fingerprints: Analysis of Internet attacks based on website fingerprints

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2409.03791

PDF

https://arxiv.org/pdf/2409.03791

Paper Information

Authors: Blerim Rexha; Arbena Musa; Kamer Vishi; Edlira Martiri
Published: 9-2-2024
Affiliation: University of Prishtina
Country: Kosovo
Conference: Int. J. Inf. Comput. Secur.

Labels Estimated by AI

Data Collection, Privacy Protection, Fingerprinting Method

These labels were automatically added by AI and may be inaccurate.

Abstract

Just as our physical activities leave traces, our virtual presence leaves unique digital fingerprints as we navigate the Internet. These digital fingerprints can reveal users' activities, including browsing history, the applications used, and even the devices employed. Many Internet users choose web browsers that provide the highest privacy protection and anonymization, such as Tor. The success of this protection depends on Tor's ability to anonymize end-user IP addresses and the other metadata that make up a website fingerprint. In this paper, we show that an attacker can deanonymize Tor traffic by applying the newest machine learning algorithms. In our experimental framework, we establish a baseline and comparative reference point using a publicly available dataset from Universidad Del Cauca, Colombia. We capture network packets over 11 days while users navigate specific web pages, recording the data in .pcapng format with the Wireshark network capture tool. After excluding extraneous packets, we apply various machine learning algorithms in our analysis. The results show that the Gradient Boosting Machine algorithm performs best in binary classification, achieving an accuracy of 0.8363. In multi-class classification, the Random Forest algorithm attains an accuracy of 0.6297.
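The abstract does not say which traffic features or model settings were used, so the following is only a minimal sketch of the general idea: summarize a captured trace as a small feature vector, then score predictions with accuracy, the metric reported above. The packet layout `(timestamp, size, direction)`, the feature names, and the helper functions `trace_features` and `accuracy` are all illustrative assumptions, not the authors' pipeline.

```python
from statistics import mean


def trace_features(packets):
    """Summarize one captured trace as a small feature dict.

    Assumed layout (hypothetical): each packet is a tuple of
    (timestamp_seconds, size_bytes, direction), where direction is
    +1 for outgoing and -1 for incoming traffic.
    """
    if not packets:
        raise ValueError("empty trace")
    times = [t for t, _, _ in packets]
    sizes = [s for _, s, _ in packets]
    outgoing = sum(1 for _, _, d in packets if d > 0)
    return {
        "packet_count": len(packets),
        "total_bytes": sum(sizes),
        "mean_size": mean(sizes),
        "duration": max(times) - min(times),
        "outgoing_ratio": outgoing / len(packets),
    }


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels.

    The same definition applies to binary labels (e.g. Tor vs. non-Tor)
    and to multi-class labels (e.g. one class per visited website).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and equal in length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy trace with made-up values, for illustration only.
example_trace = [(0.0, 100, 1), (0.5, 300, -1), (1.5, 200, -1), (2.0, 400, 1)]
example_features = trace_features(example_trace)
```

In a real experiment, feature vectors like these would be fed to a classifier such as a gradient boosting or random forest model, and `accuracy` would be computed on held-out traces. An accuracy of 0.8363 would then mean that roughly 84 of every 100 test traces were labeled correctly.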

External Datasets

Universidad Del Cauca network traffic dataset