Recurrent Neural Networks (RNNs) have been shown to be valuable for
constructing Intrusion Detection Systems (IDSs) for network data. They can
determine whether a flow is malicious before it has ended, making it
possible to take action immediately. However, considering the large number of
packets that have to be inspected, for example in cloud/fog and edge computing,
the question of computational efficiency arises. We show that by using a novel
Reinforcement Learning (RL)-based approach called SparseIDS, we can reduce the
number of consumed packets by more than three quarters while keeping
classification accuracy high. To minimize the computational cost of the
RL-based sampling, we show that a shared neural network can be used for both the
classifier and the RL logic. Thus, sampling consumes no additional resources
in deployment. Compared to various other sampling techniques,
SparseIDS consistently achieves higher classification accuracy by learning to
sample only relevant packets. A major novelty of our RL-based approach is that
it is not limited to skipping up to a predefined maximum number of samples, as
other approaches proposed in the domain of Natural Language Processing are, but
can skip arbitrarily many packets in one step. This saves even more
computational resources for long sequences. Inspecting how SparseIDS
chooses packets shows that it adopts different sampling strategies for
different attack types and network flows. Finally, we build an automatic
steering mechanism that can guide SparseIDS in deployment to achieve a desired
level of sparsity.
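
The sampling idea summarized above can be illustrated with a minimal Python sketch. It is not the SparseIDS implementation: the `step` interface, the function `classify_sparsely`, and the stand-in model `toy_step` are hypothetical names introduced only for illustration. The sketch assumes a single shared model that, for each consumed packet, returns both a maliciousness score and the number of following packets to skip, with no upper bound on that number.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Assumed interface (hypothetical): one shared model maps the current packet
# and the previous recurrent state to (new_state, score, packets_to_skip).
Step = Callable[[Sequence[float], Optional[int]], Tuple[int, float, int]]


def classify_sparsely(flow: Sequence[Sequence[float]], step: Step) -> Tuple[float, List[int]]:
    """Consume only the packets the model asks for.

    Returns the most recent score and the indices of the consumed packets.
    """
    state: Optional[int] = None
    score = 0.0
    consumed: List[int] = []
    i = 0
    while i < len(flow):
        state, score, skip = step(flow[i], state)
        consumed.append(i)
        # Jump over `skip` packets; the skip length is not capped.
        i += 1 + max(0, skip)
    return score, consumed


def toy_step(packet: Sequence[float], state: Optional[int]) -> Tuple[int, float, int]:
    """Stand-in for a trained network, used only to make the sketch runnable.

    The state counts consumed packets, and the first feature of each packet
    is interpreted directly as the number of packets to skip.
    """
    count = (state or 0) + 1
    return count, min(1.0, 0.1 * count), int(packet[0])


if __name__ == "__main__":
    flow = [[2], [0], [0], [1], [0], [0], [5], [0]]
    score, consumed = classify_sparsely(flow, toy_step)
    print(consumed, score)
```

Because the score is updated at every consumed packet, a decision is available before the flow ends, and a skip value larger than the remaining flow length simply terminates the loop.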