Cyber-security has garnered significant attention due to the increased dependency
of individuals and organizations on the Internet and their concern about the
security and privacy of their online activities. Several machine learning
(ML)-based network intrusion detection systems (NIDSs) have previously been
developed to protect against malicious online behavior. This paper proposes a
novel multi-stage optimized ML-based NIDS framework that reduces computational
complexity while maintaining its detection performance. This work studies the
impact of oversampling techniques on the models' training sample size and
determines the minimal suitable training sample size. Furthermore, it compares
two feature selection techniques, information gain and
correlation-based, and explores their effect on detection performance and time
complexity. Moreover, different ML hyper-parameter optimization techniques are
investigated to enhance the NIDS's performance. The performance of the proposed
framework is evaluated using two recent intrusion detection datasets, the
CICIDS 2017 and the UNSW-NB 2015 datasets. Experimental results show that the
proposed model significantly reduces the required training sample size (up to
74%) and feature set size (up to 50%). Moreover, hyper-parameter optimization
enhances the model's performance, yielding detection accuracies over 99% on
both datasets and outperforming recent works in the literature by 1-2% in
accuracy, with a 1-2% lower false alarm rate.
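
As a rough illustration of the information gain criterion named above, the following minimal sketch ranks discrete features by IG(Y; X) = H(Y) - H(Y | X). It is not the paper's implementation: the function names, the feature names ("protocol", "flag", "noise"), and the six-row toy data are all hypothetical and chosen only to make the computation easy to follow.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - H(Y | X) for a single discrete feature X."""
    n = len(labels)
    groups = defaultdict(list)
    for x, y in zip(feature_values, labels):
        groups[x].append(y)
    # H(Y | X): entropy of the labels within each feature value,
    # weighted by how often that value occurs.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels, names):
    """Return (names sorted by decreasing information gain, score dict)."""
    scores = {
        name: information_gain([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked, scores


# Hypothetical toy data, NOT taken from CICIDS 2017 or UNSW-NB 2015.
names = ["protocol", "flag", "noise"]
rows = [
    ("tcp", "syn", 0),
    ("tcp", "syn", 1),
    ("udp", "ack", 0),
    ("udp", "ack", 1),
    ("tcp", "ack", 1),
    ("udp", "syn", 1),
]
labels = ["attack", "attack", "benign", "benign", "benign", "attack"]

ranked, scores = rank_features(rows, labels, names)
for name in ranked:
    print(f"{name}: {scores[name]:.4f}")
```

In this toy setting, "flag" determines the label exactly and so receives the full label entropy as its gain, while "noise" is independent of the label and scores approximately zero. A selection step would keep the top-ranked features and discard the rest; continuous traffic features would need to be discretized first, which this sketch does not attempt.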