The use of Machine Learning as a Service (MLaaS) vision APIs has risen because
they offer multiple services, including pre-built models and algorithms, that
would otherwise take substantial resources to build from scratch. As these APIs
are deployed in high-stakes applications, it is important that they be robust
to different manipulations. Recent work has focused only on typical adversarial
attacks when evaluating the robustness of
vision APIs. We propose two new approaches to adversarial image generation and
use them to evaluate the robustness of Google Cloud Vision API's optical
character recognition service and of object detection APIs deployed in
real-world settings, such as sightengine.com, picpurify.com, Google Cloud Vision
API, and Microsoft Azure's Computer Vision API. Specifically, we go beyond the
conventional small-noise adversarial attacks and introduce secret embedding and
transparent adversarial examples as simpler ways to evaluate robustness. These
methods are so straightforward that even non-specialists can craft such
attacks. As a result, they pose a serious threat where APIs are used for
high-stakes applications. Our transparent adversarial examples successfully
evade state-of-the-art object detection APIs such as Microsoft Azure's Computer
Vision (attack success rate 52%) and Google Cloud Vision (attack success rate 36%).
In 90% of the images, the secretly embedded text fools time-limited human
viewers but is detected by Google Cloud Vision API's optical character
recognition. Complementing current research, our results provide simple but
unconventional methods for robustness evaluation.