In this paper, we build a speech privacy attack that exploits speech
reverberations generated by a smartphone's built-in loudspeaker and captured via
a zero-permission motion sensor (accelerometer). We design our attack,
Spearphone, and demonstrate that speech reverberations from built-in
loudspeakers, at an appropriate loudness, can impact the accelerometer, leaking
sensitive information about the speech. In particular, we show that by
exploiting the affected accelerometer readings and carefully selecting feature
sets along with off-the-shelf machine learning techniques, Spearphone can
successfully perform gender classification (accuracy over 90%) and speaker
identification (accuracy over 80%) for any audio/video playback on the
smartphone. Our results from testing the attack on a voice call and a voice
assistant response were also encouraging, showcasing the impact of the proposed
attack. In addition, we perform speech recognition and speech reconstruction to
extract, to some extent, further information about the eavesdropped speech. Our work
brings to light a fundamental design vulnerability in many currently deployed
smartphones, which may put people's speech privacy at risk while using the
smartphone in loudspeaker mode during phone calls, media playback, or voice
assistant interactions.
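
As an illustration of the classification step described above (features computed from accelerometer readings and passed to an off-the-shelf classifier), the following is a minimal sketch and not the Spearphone implementation. Everything in it is an assumption made for illustration: the synthetic tone-plus-noise traces standing in for real sensor data, the 200 Hz sampling rate, the particular time-domain features, the class labels, and the nearest-centroid classifier.

```python
import math
import random
import statistics


def window_features(samples):
    """Six simple time-domain features for one window of readings (assumed set)."""
    lo, hi = min(samples), max(samples)
    # Sign changes between consecutive samples roughly track the dominant frequency.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return [statistics.fmean(samples), statistics.pstdev(samples),
            lo, hi, hi - lo, float(crossings)]


def synthetic_window(freq_hz, rng, rate_hz=200, n=200):
    """Hypothetical stand-in for one axis of accelerometer data: a tone plus noise."""
    phase = rng.uniform(0.0, 2.0 * math.pi)
    return [0.05 * math.sin(2.0 * math.pi * freq_hz * k / rate_hz + phase)
            + rng.gauss(0.0, 0.005) for k in range(n)]


def fit_centroids(features, labels):
    """Nearest-centroid 'training': average the feature vectors of each class."""
    grouped = {}
    for vec, label in zip(features, labels):
        grouped.setdefault(label, []).append(vec)
    return {label: [statistics.fmean(col) for col in zip(*vecs)]
            for label, vecs in grouped.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))


def run_demo(seed=0, per_class=40):
    """Train and test on synthetic windows; returns held-out accuracy."""
    rng = random.Random(seed)
    # Two made-up classes that differ only in tone frequency (illustrative).
    classes = {"class_low": 40.0, "class_high": 80.0}
    data = [(window_features(synthetic_window(freq, rng)), label)
            for label, freq in classes.items() for _ in range(per_class)]
    rng.shuffle(data)
    split = len(data) // 2
    train, test = data[:split], data[split:]
    centroids = fit_centroids([f for f, _ in train], [l for _, l in train])
    correct = sum(predict(centroids, f) == label for f, label in test)
    return correct / len(test)


if __name__ == "__main__":
    print(f"held-out accuracy on synthetic data: {run_demo():.2f}")
```

The synthetic classes here are trivially separable, so the accuracy this sketch reports says nothing about the attack's real-world performance; it only shows the shape of a feature-extraction and classification pipeline.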