As neural networks grow more capable at pattern recognition, they are also
becoming larger and deeper. As a result, the effort needed to train a network
has increased dramatically. In many
cases, it is more practical to use a neural network intellectual property (IP)
that an IP vendor has already trained. Because the user has no visibility into
the training process, the neural IP may pose security threats: the IP vendor
(attacker) may embed hidden malicious functionality, i.e., neural Trojans, into
the neural IP. We show that this attack is effective and provide three
mitigation techniques: input anomaly detection, re-training, and input
preprocessing. All three techniques are shown to be effective. The input anomaly
detection approach detects 99.8% of Trojan triggers, albeit with a 12.2%
false-positive rate. The re-training approach prevents 94.1% of Trojan
triggers from activating the Trojan, although it requires that the neural
IP be reconfigurable. In the input preprocessing approach, 90.2% of Trojan
triggers are rendered ineffective and no assumption about the neural IP is
needed.
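
To make the input anomaly detection idea concrete, the following is a minimal sketch of a guard placed in front of an untrusted neural IP. It is an illustration under stated assumptions, not the detector evaluated here: the centroid-distance score, the `quantile` parameter, and the names `make_guard` and `guarded_predict` are all hypothetical. The guard is fitted only on legitimate inputs and rejects anything that lies farther from them than a chosen fraction of the legitimate data itself.

```python
import math
from typing import Callable, List, Optional, Sequence

Vector = Sequence[float]


def centroid_of(samples: Sequence[Vector]) -> List[float]:
    """Component-wise mean of the legitimate samples."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(s[i] for s in samples) / n for i in range(dim)]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def make_guard(legit_samples: Sequence[Vector],
               quantile: float = 0.9) -> Callable[[Vector], bool]:
    """Build an anomaly test from legitimate inputs only (assumed detector).

    The threshold is the distance at the given quantile of the legitimate
    samples, so roughly (1 - quantile) of legitimate inputs are flagged:
    a lower quantile catches more triggers but raises false positives.
    """
    centroid = centroid_of(legit_samples)
    dists = sorted(distance(s, centroid) for s in legit_samples)
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    threshold = dists[idx]

    def is_anomalous(x: Vector) -> bool:
        return distance(x, centroid) > threshold

    return is_anomalous


def guarded_predict(model: Callable[[Vector], float],
                    is_anomalous: Callable[[Vector], bool],
                    x: Vector) -> Optional[float]:
    """Forward x to the untrusted model only if it looks legitimate."""
    if is_anomalous(x):
        return None  # input rejected; the neural IP never sees it
    return model(x)


# Usage with a stand-in "model" (a plain function, not a real network).
legit = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
guard = make_guard(legit, quantile=0.9)
accepted = guarded_predict(lambda v: sum(v), guard, [0.5, 0.5])
rejected = guarded_predict(lambda v: sum(v), guard, [10.0, 10.0])
```

The sketch also shows why detection rate and false-positive rate trade off against each other: tightening the threshold rejects more out-of-distribution inputs, including Trojan triggers, but also rejects more legitimate ones.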