In secure machine learning inference, most schemes assume that the server is
semi-honest (it follows the protocol honestly but attempts to infer additional
information). In the real world, however, the server may be malicious (e.g.,
using a low-quality model or deviating from the protocol). Although a
few studies have considered a malicious server that deviates from the protocol,
they do not verify model accuracy (i.e., detect a malicious server that uses
a low-quality model) while preserving the privacy of both the server's
model and the client's inputs. To address these issues, we propose
\textit{Fusion}, in which the client mixes public samples (whose query results
are known) with its own samples and uses the mixture as the input of a
multi-party computation that jointly performs the secure inference. Since a server
that uses a low-quality model or deviates from the protocol can only produce
results that the client can easily detect, \textit{Fusion} forces the
server to behave honestly, thereby addressing all of the aforementioned issues
without relying on expensive cryptographic techniques. Our evaluation indicates
that \textit{Fusion} is 48.06$\times$ faster and uses 30.90$\times$ less
communication than the existing maliciously secure inference protocol (which
currently does not support verification of model accuracy). In
addition, to demonstrate scalability, we conduct ImageNet-scale inference on
the practical ResNet50 model; it takes 8.678 minutes and 10.117 GiB of
communication in a WAN setting, which is 1.18$\times$ faster and uses
2.64$\times$ less communication than the semi-honest protocol.
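
The mix-and-check idea described above can be illustrated with a minimal plaintext sketch. It is not the \textit{Fusion} protocol itself: the secure multi-party computation is replaced by a toy parity function, and the helper names \texttt{mix} and \texttt{verify}, the accuracy \texttt{threshold}, and all sample values are hypothetical choices made only for illustration.

```python
import random


def mix(private_inputs, public_pairs, rng):
    """Hide public (input, known result) pairs among the private inputs."""
    # Each item is (input, known result or None, original private index or None).
    items = [(x, None, i) for i, x in enumerate(private_inputs)]
    items += [(x, y, None) for x, y in public_pairs]
    rng.shuffle(items)  # only the client knows which positions are public
    return items


def verify(results, items, threshold):
    """Return private results in original order, or None if the check fails."""
    if len(results) != len(items):
        return None  # the server returned the wrong number of results
    public = [(r, y) for r, (_, y, idx) in zip(results, items) if idx is None]
    correct = sum(1 for r, y in public if r == y)
    if not public or correct / len(public) < threshold:
        return None  # too many wrong answers on samples with known results
    private = sorted(
        (idx, r) for r, (_, _, idx) in zip(results, items) if idx is not None
    )
    return [r for _, r in private]


# Toy usage: a parity function stands in for the jointly computed inference.
rng = random.Random(0)
items = mix([1, 2, 3], [(10, 0), (11, 1), (12, 0), (13, 1)], rng)
batch = [x for x, _, _ in items]  # what the server-side computation receives
honest = verify([x % 2 for x in batch], items, threshold=0.9)
cheating = verify([0 for _ in batch], items, threshold=0.9)
```

In this toy run, the honest results pass the check and the private results are recovered in their original order, whereas a server that always answers 0 is correct on only half of the public samples and is rejected. Because the server cannot tell public samples from private ones, it cannot answer only the public samples correctly.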