AI Security Portal K Program
Duumviri: Detecting Trackers and Mixed Trackers with a Breakage Detector
Abstract
Web tracking harms user privacy. As a result, tracker detection and blocking tools are in common use among Internet users. However, no such tool can be perfect, so there is a trade-off between avoiding breakage (caused by unintentionally blocking required functionality) and failing to block some trackers. State-of-the-art tools usually rely on user reports and developer effort to detect breakage, which broadly has two causes: 1) misidentifying non-trackers as trackers, and 2) blocking mixed trackers, which blend tracking with functional components. We propose incorporating a machine learning-based breakage detector into the tracker detection pipeline to automatically avoid misidentifying functional resources. For both tracker detection and breakage detection, we propose using differential features that more clearly expose the differences caused by blocking a request. We designed and implemented a prototype of our proposed approach, Duumviri, for non-mixed trackers. We then adapt it to automatically identify mixed trackers, drawing differential features at partial-request granularity. For non-mixed trackers, an evaluation of Duumviri on 15K pages shows that it can replicate the labels of the human-generated filter list EasyPrivacy with an accuracy of 97.44%. Through manual analysis, we find that Duumviri can identify previously unreported trackers and that its breakage detector can identify overly strict EasyPrivacy rules that cause breakage. For mixed trackers, Duumviri is the first automated mixed tracker detector and achieves a lower-bound accuracy of 74.19%. Duumviri has enabled us to detect and confirm 22 previously unreported unique trackers and 26 unique mixed trackers.
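The idea of differential features and of gating tracker blocking on a breakage prediction can be illustrated with a minimal sketch. It is not Duumviri's implementation: the `PageObservation` fields, the function names, and the numbers in the usage example are all hypothetical, and the abstract does not say which quantities the prototype actually measures. The sketch only assumes that a page is loaded once normally and once with a single request blocked, and that the two loads are compared.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PageObservation:
    """Hypothetical summary of one page load (field names are illustrative)."""
    num_requests: int        # network requests issued during the load
    num_cookies: int         # cookies present at the end of the load
    num_dom_nodes: int       # DOM nodes after the page settles
    num_visible_images: int  # images rendered in the viewport


def differential_features(baseline: PageObservation,
                          blocked: PageObservation) -> dict[str, int]:
    """Return blocked-minus-baseline deltas for every observed quantity."""
    return {
        f"delta_{f.name}": getattr(blocked, f.name) - getattr(baseline, f.name)
        for f in fields(baseline)
    }


def should_block(predicted_tracker: bool, predicted_breakage: bool) -> bool:
    """Assumed gating rule: block only predicted trackers whose removal
    is not predicted to break the page."""
    return predicted_tracker and not predicted_breakage


# Usage with made-up numbers: one load without blocking, one with a request blocked.
baseline = PageObservation(num_requests=120, num_cookies=14,
                           num_dom_nodes=2300, num_visible_images=35)
blocked = PageObservation(num_requests=97, num_cookies=9,
                          num_dom_nodes=2290, num_visible_images=35)
deltas = differential_features(baseline, blocked)
```

In a pipeline of this shape, the deltas would be the input to two learned classifiers, one predicting whether the request is a tracker and one predicting breakage; the boolean rule above stands in for how their outputs might be combined.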