Abstract
How can we find patterns and anomalies in a tensor, or multi-dimensional
array, in an efficient and directly interpretable way? How can we do this in an
online environment, where a new tensor arrives each time step? Finding patterns
and anomalies in a tensor is a crucial problem with many applications,
including building safety monitoring, patient health monitoring, cyber
security, terrorist detection, and fake user detection in social networks.
Standard PARAFAC and Tucker decomposition results are not directly
interpretable. Although a few sampling-based methods have previously been
proposed towards better interpretability, they need to be made faster, more
memory efficient, and more accurate.
In this paper, we propose CTD, a fast, accurate, and directly interpretable
tensor decomposition method based on sampling. CTD-S, the static version of
CTD, provably guarantees a high accuracy that is 17 ~ 83x more accurate than
that of the state-of-the-art method. Also, CTD-S is made 5 ~ 86x faster, and 7
~ 12x more memory-efficient than the state-of-the-art method by removing
redundancy. CTD-D, the dynamic version of CTD, is the first interpretable
dynamic tensor decomposition method ever proposed. Also, it is made 2 ~ 3x
faster than already fast CTD-S by exploiting factors at previous time step and
by reordering operations. With CTD, we demonstrate how the results can be
effectively interpreted in the online distributed denial of service (DDoS)
attack detection.