Measuring and Modeling the Free Content Web
Abstract
Free content websites that provide free books, music, games, movies, and similar content have existed on the Internet for many years. Although such websites are commonly believed to differ from premium websites providing the same content types, the literature lacks an analysis that supports this belief. In particular, it is unclear whether those websites are as safe as their premium counterparts. In this paper, we investigate, through analysis and quantification, the similarities and differences between free content and premium websites, including their risk profiles. To conduct this analysis, we assembled a list of 834 free content websites offering books, games, movies, music, and software, and 728 premium websites offering content of the same types. We then contribute domain-, content-, and risk-level analyses, examining and contrasting the websites' domain names, creation times, SSL certificates, HTTP requests, page size, average load time, and content type. For risk analysis, we examine the maliciousness of these websites at the website and component levels. Among other findings, we show that free content websites tend to be widely distributed across TLDs and exhibit more dynamics, with an upward trend in newly registered domains. Moreover, free content websites are 4.5 times more likely to use an expired certificate, 19 times more likely to be malicious at the website level, and 2.64 times more likely to be malicious at the component level. Encouraged by the clear differences between the two types of websites, we explore the automation and generalization of risk modeling for risky free content websites, showing that a simple machine learning-based technique can achieve 86.81% accuracy in identifying them.
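The "times more likely" figures in the abstract can be read as a ratio of proportions between the two groups of websites. The following is a minimal sketch of that reading, not the authors' method: the function name `relative_likelihood` is hypothetical, and the flagged counts are placeholder values chosen for illustration, not results from the paper. Only the group sizes (834 and 728) come from the abstract.

```python
def relative_likelihood(flagged_a: int, total_a: int,
                        flagged_b: int, total_b: int) -> float:
    """Return the flagged proportion of group A divided by that of group B."""
    if total_a <= 0 or total_b <= 0:
        raise ValueError("group sizes must be positive")
    rate_a = flagged_a / total_a
    rate_b = flagged_b / total_b
    if rate_b == 0:
        raise ZeroDivisionError("reference group has no flagged websites")
    return rate_a / rate_b


# Group sizes as stated in the abstract.
FREE_TOTAL = 834
PREMIUM_TOTAL = 728

# HYPOTHETICAL flagged counts, for illustration only (not from the paper).
HYPOTHETICAL_FREE_FLAGGED = 40
HYPOTHETICAL_PREMIUM_FLAGGED = 10

ratio = relative_likelihood(
    HYPOTHETICAL_FREE_FLAGGED, FREE_TOTAL,
    HYPOTHETICAL_PREMIUM_FLAGGED, PREMIUM_TOTAL,
)
print(f"free vs. premium ratio: {ratio:.2f}")
```

Because the two groups differ in size, the ratio is computed over proportions rather than raw counts; comparing raw counts alone would overstate or understate the difference.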