Early-Stage Anomaly Detection: A Study of Model Performance on Complete vs. Partial Flows

Source

https://arxiv.org/abs/2407.02856

PDF

https://arxiv.org/pdf/2407.02856

Paper Information

Author: Adrian Pekar, Richard Jozsa
Published: 7-3-2024
Updated: 6-30-2025
Affiliation: Budapest University of Technology and Economics
Country: Hungary

Labels Estimated by AI

Intrusion Detection System, Traffic Classification, Performance Evaluation Metrics

These labels were automatically added by AI and may be inaccurate.

Abstract

This study investigates the efficacy of machine learning models in network security threat detection through the critical lens of partial versus complete flow information, addressing a common gap between research settings and real-time operational needs. We systematically evaluate how a standard benchmark model, Random Forest, performs under varying training and testing conditions (complete/complete, partial/partial, complete/partial), quantifying the performance impact when dealing with the incomplete data typical in real-time environments. Our findings demonstrate a significant performance difference, with precision and recall dropping by up to 30% under certain conditions when models trained on complete flows are tested against partial flows. The study also reveals that, for the evaluated dataset and model, a minimum threshold around 7 packets in the test set appears necessary for maintaining reliable detection rates, providing valuable, quantified insights for developing more realistic real-time detection strategies.
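To make the complete-versus-partial distinction concrete, the following is a minimal sketch of how a partial flow might be derived by keeping only the first N packets of a flow, and how precision and recall might then be computed on predictions. It is not the authors' pipeline: the function names `flow_features` and `precision_recall`, the `(timestamp, size)` packet representation, and the four summary features are all assumptions chosen for illustration, and the toy flow is synthetic rather than taken from CICIDS-2017.

```python
from statistics import mean


def flow_features(packets, max_packets=None):
    """Summarise a flow from its first `max_packets` packets.

    `packets` is a list of (timestamp_seconds, size_bytes) tuples in
    arrival order. With max_packets=None the complete flow is used.
    The feature set here is hypothetical, not the one used in the paper.
    """
    seen = packets if max_packets is None else packets[:max_packets]
    if not seen:
        raise ValueError("flow has no packets")
    sizes = [size for _, size in seen]
    return {
        "packet_count": len(seen),
        "total_bytes": sum(sizes),
        "mean_size": mean(sizes),
        "duration": seen[-1][0] - seen[0][0],
    }


def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Synthetic 10-packet flow: one packet every 0.1 s, sizes growing by 10 bytes.
flow = [(0.1 * i, 100 + 10 * i) for i in range(10)]

complete = flow_features(flow)                 # complete-flow view
partial = flow_features(flow, max_packets=7)   # view after only 7 packets
```

In a complete/partial experiment, a model would be fitted on features such as `complete` and scored on features such as `partial`. Because counts, byte totals, and durations all shift when a flow is truncated, the test-time feature distribution differs from the one seen in training, which is one plausible reading of why the abstract reports a drop in precision and recall for that condition.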

External Datasets

CICIDS-2017