A growing challenge in modern cyberspace is the identification of malicious
activity in network connections. The rapid growth of machine learning in the
past few years has led to its increasing use, especially in the network
intrusion detection research community. In applying these techniques, the
community has recognized that datasets, particularly labeled ones, are pivotal
for constructing learning models that identify malicious packets and
connections.
However, publicly available, relevant datasets remain scarce for researchers
in the network intrusion detection community. Thus, in this paper, we
introduce a method for constructing labeled flow data that combines packet
meta-information with IDS logs to infer labels for intrusion detection
research. Specifically, we designed a NetFlow-compatible format because a
large body of network devices, such as routers and switches, can export
NetFlow records from raw traffic. In doing so, the introduced method helps
researchers obtain relevant network flow datasets along with label
information.
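
As a rough illustration of the labeling idea, the sketch below joins flow records with IDS alerts on the 5-tuple and a time window. It is a minimal sketch under stated assumptions, not the method's actual implementation: the `Flow`, `Alert`, and `label_flows` names, the `slack` tolerance, and the rule that flows without a matching alert default to "benign" are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List, NamedTuple


class FlowKey(NamedTuple):
    """5-tuple identifying a flow (hypothetical layout for this sketch)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # IANA protocol number, e.g. 6 for TCP


@dataclass
class Flow:
    key: FlowKey
    start: float           # flow start, seconds since epoch
    end: float             # flow end, seconds since epoch
    label: str = "benign"  # assumption: unmatched flows are treated as benign


@dataclass
class Alert:
    key: FlowKey
    timestamp: float  # time the IDS raised the alert
    signature: str    # alert name, used here as the label


def label_flows(flows: List[Flow], alerts: List[Alert],
                slack: float = 1.0) -> List[Flow]:
    """Label each flow with the first alert sharing its 5-tuple whose
    timestamp falls inside the flow's lifetime, widened by `slack` seconds."""
    # Index alerts by 5-tuple so each flow only scans candidate alerts.
    by_key: Dict[FlowKey, List[Alert]] = {}
    for alert in alerts:
        by_key.setdefault(alert.key, []).append(alert)

    for flow in flows:
        for alert in by_key.get(flow.key, []):
            if flow.start - slack <= alert.timestamp <= flow.end + slack:
                flow.label = alert.signature
                break
    return flows
```

A flow whose 5-tuple and lifetime match an alert receives that alert's signature as its label; all other flows keep the default. A real pipeline would also need to handle bidirectional flows and clock skew between the exporter and the IDS, which this sketch ignores.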