A Modular and Adaptive System for Business Email Compromise Detection

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2308.10776

PDF

https://arxiv.org/pdf/2308.10776

Paper Information

Authors: Jan Brabec; Filip Šrajer; Radek Starosta; Tomáš Sixta; Marc Dupont; Miloš Lenoch; Jiří Menšík; Florian Becker; Jakub Boros; Tomáš Pop; Pavel Novák
Published: 8-22-2023
Affiliation: Cisco Systems
Country: United States of America
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Phishing Detection; Business Email Compromise; Performance Evaluation

These labels were automatically added by AI and may be inaccurate.

Abstract

The growing sophistication of Business Email Compromise (BEC) and spear phishing attacks poses significant challenges to organizations worldwide. The techniques featured in traditional spam and phishing detection are insufficient because modern BEC attacks are tailored and often blend in with regular benign traffic. Recent advances in machine learning, particularly in Natural Language Understanding (NLU), offer a promising avenue for combating such attacks. In a practical system, however, constraints such as data availability, operational costs, verdict explainability requirements, and the need to evolve the system robustly make it essential to combine multiple approaches. We present CAPE, a comprehensive and efficient system for BEC detection that has been proven in a production environment for over two years. Rather than being a single model, CAPE is a system that combines independent ML models and algorithms detecting BEC-related behaviors across various email modalities such as text, images, metadata and the email's communication context. This decomposition makes CAPE's verdicts naturally explainable. In the paper, we describe the design principles and constraints behind its architecture, as well as the challenges of model design, evaluation, and continuous adaptation of the system through a Bayesian approach that combines limited data with domain knowledge. Furthermore, we elaborate on several specific behavioral detectors, such as those based on Transformer neural architectures.
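
The abstract's idea of combining independent behavioral detectors through a Bayesian approach that mixes limited data with domain knowledge can be illustrated with a small sketch. This is not CAPE's implementation, which the abstract does not specify. It assumes a naive-Bayes style combination (detectors treated as conditionally independent), Beta priors standing in for domain knowledge, and hypothetical detector names, pseudo-counts, and base rate chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class BetaRate:
    """Beta belief about how often a detector fires on one class of email."""
    alpha: float  # pseudo-count of "fired" (encodes domain knowledge)
    beta: float   # pseudo-count of "did not fire"

    def update(self, fired: int, total: int) -> "BetaRate":
        # Conjugate update: add observed counts to the prior pseudo-counts.
        return BetaRate(self.alpha + fired, self.beta + (total - fired))

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


@dataclass
class Detector:
    name: str                 # hypothetical behavior name
    rate_on_bec: BetaRate     # belief about P(fires | BEC)
    rate_on_benign: BetaRate  # belief about P(fires | benign)

    def log_likelihood_ratio(self, fired: bool) -> float:
        p_bec, p_ben = self.rate_on_bec.mean, self.rate_on_benign.mean
        if fired:
            return math.log(p_bec / p_ben)
        return math.log((1 - p_bec) / (1 - p_ben))


def score_email(detectors, firings, prior_bec=0.001):
    """Return (posterior P(BEC), per-detector log-odds contributions).

    Assumes detectors are conditionally independent given the class,
    which is a simplification made for this sketch.
    """
    log_odds = math.log(prior_bec / (1 - prior_bec))
    contributions = {}
    for d in detectors:
        llr = d.log_likelihood_ratio(firings[d.name])
        contributions[d.name] = llr  # doubles as a simple explanation
        log_odds += llr
    return 1 / (1 + math.exp(-log_odds)), contributions


# Illustrative priors (expert guesses) updated with a few labeled emails.
detectors = [
    Detector("urgent_payment_text",
             BetaRate(8, 2).update(fired=3, total=4),
             BetaRate(1, 99).update(fired=2, total=200)),
    Detector("sender_impersonation",
             BetaRate(6, 4).update(fired=2, total=4),
             BetaRate(2, 98).update(fired=1, total=200)),
]
posterior, why = score_email(
    detectors, {"urgent_payment_text": True, "sender_impersonation": True}
)
print(f"P(BEC) = {posterior:.3f}")
for name, llr in why.items():
    print(f"  {name}: {llr:+.2f} log-odds")
```

Because each detector contributes its own additive log-odds term, the per-detector breakdown gives a simple form of the explainability the abstract attributes to the decomposition, and a new detector can be added with only a prior before much labeled data exists.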

External Datasets

Enron dataset