Using mathematical modeling and human subjects experiments, this research
explores the extent to which emerging webcams might leak recognizable textual
and graphical information through the eyeglass reflections that they capture.
The primary goal of our work is to measure, compute, and predict the
factors, limits, and thresholds of recognizability as webcam technology
evolves. Our work explores and characterizes the viable threat models
based on optical attacks using multi-frame super resolution techniques on
sequences of video frames. Our models and experimental results in a controlled
lab setting show that, with a 720p webcam, it is possible to reconstruct and
recognize on-screen text as small as 10 mm in height with over 75% accuracy.
We further apply this threat model to textual web content under varying
attacker capabilities to find the thresholds at which text becomes recognizable.
Our user study with 20 participants suggests that present-day 720p webcams are
sufficient for adversaries to reconstruct textual content on websites that use
large fonts.
Our models further show that the evolution towards 4K cameras will move the
text-leakage threshold to the point where most header text on popular websites
can be reconstructed. Besides textual targets, a case study on recognizing a closed-world
dataset of Alexa top 100 websites with 720p webcams shows a maximum recognition
accuracy of 94% with 10 participants even without using machine-learning
models. Our research proposes near-term mitigations, including a software
prototype that lets users blur the eyeglass areas of their video streams.
For possible long-term defenses, we advocate an individual reflection testing
procedure to assess threats under various settings, and justify the importance
of following the principle of least privilege for privacy-sensitive scenarios.