Cyber-physical systems (CPS) consist of sensors, actuators, and controllers
all communicating over a network; if any subset becomes compromised, an
attacker could cause significant damage. With access to data logs and a model
of the CPS, the physical effects of an attack could be detected
before any damage is done. Manually building a model that is accurate enough in
practice, however, is extremely difficult. In this paper, we propose a novel
approach for constructing models of CPS automatically, by applying supervised
machine learning to data traces obtained after systematically seeding their
software components with faults ("mutants"). We demonstrate the efficacy of
this approach on the simulator of a real-world water purification plant,
presenting a framework that automatically generates mutants, collects data
traces, and learns an SVM-based model. Using cross-validation and statistical
model checking, we show that the learnt model characterises an invariant
physical property of the system. Furthermore, we demonstrate the usefulness of
the invariant by subjecting the system to 55 network and code-modification
attacks, and showing that it can detect 85% of them from the data logs
generated at runtime.
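
As a rough illustration of the mutant-generation and trace-collection steps described above, the following Python sketch seeds a fault into a toy controller and labels the resulting data traces. It is a minimal sketch under stated assumptions, not the framework from the paper: the single-tank plant, the `pump_on` controller, the `FlipComparison` mutation operator, the feature-vector shape (level now, level `lag` steps later), and the labelling rule are all hypothetical choices made for this example. The learning step is omitted; the labelled vectors are the kind of input one would pass to an SVM learner.

```python
import ast

# Hypothetical controller for a single tank: run the inlet pump while the
# level is below a setpoint. This stands in for real control code.
CONTROLLER_SRC = """
def pump_on(level):
    return level < 800.0
"""


class FlipComparison(ast.NodeTransformer):
    """Illustrative mutation operator: negate every ordering comparison."""

    SWAPS = {ast.Lt: ast.GtE, ast.GtE: ast.Lt, ast.Gt: ast.LtE, ast.LtE: ast.Gt}

    def visit_Compare(self, node):
        self.generic_visit(node)
        # Replace each operator with its negation; leave others unchanged.
        node.ops = [self.SWAPS.get(type(op), type(op))() for op in node.ops]
        return node


def load_controller(source, mutate=False):
    """Compile the controller source, optionally seeding it with a fault."""
    tree = ast.parse(source)
    if mutate:
        tree = ast.fix_missing_locations(FlipComparison().visit(tree))
    namespace = {}
    exec(compile(tree, "<controller>", "exec"), namespace)
    return namespace["pump_on"]


def simulate(controller, level=500.0, steps=50, inflow=20.0, outflow=10.0):
    """Toy plant: the tank always drains, and fills only while the pump runs."""
    trace = []
    for _ in range(steps):
        gain = inflow if controller(level) else 0.0
        level = max(0.0, level + gain - outflow)
        trace.append(level)
    return trace


def feature_vectors(trace, lag=5):
    """Pair each reading with the reading `lag` steps later."""
    return [(trace[i], trace[i + lag]) for i in range(len(trace) - lag)]


normal_trace = simulate(load_controller(CONTROLLER_SRC))
mutant_trace = simulate(load_controller(CONTROLLER_SRC, mutate=True))

# Assumed labelling rule: vectors from the unmodified system are positive (+1);
# mutant vectors are negative (-1) only if never observed in normal operation.
normal_vectors = set(feature_vectors(normal_trace))
dataset = [(v, +1) for v in feature_vectors(normal_trace)]
dataset += [(v, -1) for v in feature_vectors(mutant_trace) if v not in normal_vectors]
```

In this toy setting the mutant never starts the pump, so the tank drains to empty while the unmodified controller holds the level near its setpoint; the two sets of feature vectors are therefore easy to separate.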