Gathering cyber threat intelligence from open sources is becoming
increasingly important for achieving and maintaining a high level of security
as systems become larger and more complex. However, the volume of material in
these sources often leads to information overload. It is therefore useful to
apply machine learning models that condense the information to what is necessary.
Yet, previous studies and applications have shown that existing classifiers are
not able to extract specific information about emerging cybersecurity events
due to their low generalization ability. Therefore, we propose a system to
overcome this problem by training a new classifier for each new incident. Since
standard training methods would require a large amount of labelled data for
this, we combine three low-data regime techniques (transfer learning, data
augmentation, and few-shot learning) to train a high-quality classifier from
very few labelled instances. We evaluated our approach on a novel dataset,
labelled by three experts, that is derived from the Microsoft Exchange Server
data breach of 2021. Our findings reveal an increase in F1 score of more
than 21 points compared to standard training methods and more than 18 points
compared to a state-of-the-art few-shot learning method. Furthermore, a
classifier trained with this method on only 32 instances scores less than 5 F1
points below a classifier trained on 1800 instances.