In the Internet of Things (IoT) environment, continuous interaction among a
large number of devices generates complex and dynamic network traffic, which
poses significant challenges to rule-based detection approaches. Machine
learning (ML)-based traffic detection technology, capable of identifying
anomalous patterns and potential threats within this traffic, serves as a
critical component in ensuring network security. This study first identifies a
significant issue with widely adopted feature extraction tools (e.g.,
CICFlowMeter): the extensive use of time- and length-related features leads to
high sparsity, which adversely affects model convergence. Furthermore, existing
traffic detection methods generally lack an embedding mechanism capable of
efficiently and comprehensively capturing the semantic characteristics of
network traffic. To address these challenges, we propose a novel feature
extraction tool that eliminates traditional time and length features in favor
of context-aware semantic features related to the source host, thus improving
the generalizability of the model. In addition, we design an embedding training
framework that integrates the unsupervised DBSCAN clustering algorithm with a
contrastive learning strategy to effectively capture fine-grained semantic
representations of traffic. Extensive empirical evaluations are conducted on
the real-world MAWI dataset to validate the proposed method in terms of
detection accuracy, robustness, and generalization. Comparative experiments
against several state-of-the-art (SOTA) models demonstrate the superior
performance of our approach. We also confirm that it is applicable and
deployable in real-time scenarios.
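
As a rough illustration of how DBSCAN clustering and contrastive learning can be combined, the sketch below treats DBSCAN cluster ids as pseudo-labels and scores a set of embeddings with a supervised-contrastive style loss. It is a minimal sketch under stated assumptions, not the framework proposed here: the function names (`dbscan`, `contrastive_loss`), the parameters `eps`, `min_pts`, and `temperature`, the toy 2-D points, and the choice to skip noise points as anchors are all hypothetical and chosen only for illustration.

```python
import math


def dbscan(points, eps, min_pts):
    """Tiny pure-Python DBSCAN; returns one label per point, -1 meaning noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbors(j)
            if len(nb_j) >= min_pts:  # j is a core point, so expand through it
                queue.extend(nb_j)
    return labels


def contrastive_loss(embeddings, labels, temperature=0.5):
    """Supervised-contrastive style loss with cluster ids as pseudo-labels.

    Assumption for this sketch: noise points (-1) are never anchors or
    positives, but they still appear in the denominator as negatives.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    z = [normalize(v) for v in embeddings]
    total, count = 0.0, 0
    for i, label_i in enumerate(labels):
        if label_i == -1:
            continue
        # Exponentiated, temperature-scaled cosine similarity to every other sample.
        sims = {
            j: math.exp(sum(a * b for a, b in zip(z[i], z[j])) / temperature)
            for j in range(len(z)) if j != i
        }
        positives = [j for j in sims if labels[j] == label_i]
        if not positives:
            continue
        denom = sum(sims.values())
        total += -sum(math.log(sims[p] / denom) for p in positives) / len(positives)
        count += 1
    return total / count if count else 0.0


# Toy stand-ins for per-flow feature vectors: two dense groups and one outlier.
toy_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
              (20.0, 20.0)]
pseudo_labels = dbscan(toy_points, eps=0.5, min_pts=2)
```

In a training loop, an encoder would be updated to lower this loss, which pulls flows from the same cluster together in embedding space and pushes other flows apart. The encoder and optimizer are omitted here because the abstract does not specify them.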