Neural Architecture Search (NAS) is an emerging machine learning (ML) paradigm
that automatically searches for model architectures tailored to given tasks,
greatly simplifying the development of ML systems and propelling the trend of
ML democratization. Yet, little is known about the potential security risks
introduced by NAS, which is concerning given the increasing use of
NAS-generated models in critical domains.
This work represents a solid initial step toward bridging this gap. Through
an extensive empirical study of 10 popular NAS methods, we show that, compared
with their manually designed counterparts, NAS-generated models tend to be
more vulnerable to various malicious attacks (e.g., adversarial evasion, model
poisoning, and functionality stealing). Further, with both empirical and
analytical evidence, we provide possible explanations for this phenomenon:
given the prohibitively large search space and high training cost, most NAS
methods favor models that converge quickly in the early stages of training;
this preference results in architectural properties associated with attack
vulnerability, such as high loss smoothness and low gradient variance (see the
sketch below). Our findings not only reveal the
relationships between model characteristics and attack vulnerability but also
suggest the inherent connections underlying different attacks. Finally, we
discuss potential remedies to mitigate these drawbacks, including increasing
cell depth and suppressing skip connections, which point to several promising
research directions.
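
To make the two properties above concrete, below is a minimal sketch, assuming
PyTorch, of how gradient variance and (input-space) loss smoothness might be
estimated empirically. This is our illustration rather than the paper's
measurement protocol; model, loss_fn, and loader are hypothetical placeholders,
and the finite-difference smoothness proxy is only one of several reasonable
choices.

    import torch

    def gradient_variance(model, loss_fn, loader, n_batches=8):
        """Average per-parameter variance of mini-batch gradients.

        Lower values indicate more consistent gradients across batches,
        one of the properties associated above with fast early convergence.
        """
        grads = []
        for i, (x, y) in enumerate(loader):
            if i >= n_batches:
                break
            model.zero_grad()
            loss_fn(model(x), y).backward()
            # Flatten all parameter gradients into one vector per batch.
            grads.append(torch.cat([p.grad.flatten()
                                    for p in model.parameters()
                                    if p.grad is not None]).detach().clone())
        stacked = torch.stack(grads)  # shape: (n_batches, n_params)
        return stacked.var(dim=0, unbiased=False).mean().item()

    def loss_smoothness(model, loss_fn, x, y, eps=1e-3, n_dirs=8):
        """Finite-difference proxy for (input-space) loss smoothness.

        Measures how much the input gradient changes under small random
        input perturbations; smaller values mean a smoother loss surface.
        """
        x = x.clone().requires_grad_(True)
        g0 = torch.autograd.grad(loss_fn(model(x), y), x)[0]
        ratios = []
        for _ in range(n_dirs):
            d = torch.randn_like(x)
            d = eps * d / d.norm()  # random perturbation of norm eps
            xp = (x + d).detach().requires_grad_(True)
            g1 = torch.autograd.grad(loss_fn(model(xp), y), xp)[0]
            # Rough local Lipschitz estimate of the gradient map.
            ratios.append(((g1 - g0).norm() / eps).item())
        return sum(ratios) / len(ratios)

Under this sketch, the models described above, those favored by NAS for fast
early convergence, would be expected to yield smaller values from both
functions than comparable manually designed models: a smaller loss_smoothness
estimate corresponds to a smoother loss surface, and a smaller
gradient_variance to more consistent mini-batch gradients.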