Casper: Prompt Sanitization for Protecting User Privacy in Web-Based Large Language Models

Modern mobile devices have access to a wealth of data suitable for learning models, which in turn can greatly improve the user experience on the device. For example, language models can improve speech recognition and text entry, and image models can automatically select good photos. However, this rich data is often privacy sensitive, large in quantity, or both, which may preclude logging to the data center and training there using conventional approaches. We advocate an alternative that leaves the training data distributed on the mobile devices, and learns a shared model by aggregating locally-computed updates. We term this decentralized approach Federated Learning. We present a practical method for the federated learning of deep networks based on iterative model averaging, and conduct an extensive empirical evaluation, considering five different model architectures and four datasets. These experiments demonstrate the approach is robust to the unbalanced and non-IID data distributions that are a defining characteristic of this setting. Communication costs are the principal constraint, and we show a reduction in required communication rounds by 10-100x as compared to synchronized stochastic gradient descent.
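
The iterative model averaging described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy one-dimensional linear model, noiseless synthetic data, and full client participation in every round. The helper names (`local_update`, `federated_average`, `run_federated_training`, `make_clients`) and all hyperparameters are hypothetical choices made for illustration only.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few epochs of plain SGD on one client's data for y ~ w*x + b."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y  # prediction error on this example
            w -= lr * err * x      # gradient step for the slope
            b -= lr * err          # gradient step for the intercept
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Average client models, weighting each by its number of examples."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def run_federated_training(clients, rounds=200):
    """Each round: clients train locally, then only their models are averaged."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


def make_clients(seed=0):
    """Build unbalanced, non-IID toy clients (assumed data, y = 2x + 1)."""
    rng = random.Random(seed)
    # (x_min, x_max, number_of_examples): different ranges and sizes per client
    specs = [(-1.0, -0.2, 5), (-0.4, 0.4, 20), (0.2, 1.0, 50)]
    clients = []
    for lo, hi, n in specs:
        data = []
        for _ in range(n):
            x = rng.uniform(lo, hi)
            data.append((x, 2.0 * x + 1.0))
        clients.append(data)
    return clients


if __name__ == "__main__":
    w, b = run_federated_training(make_clients())
    print(f"learned slope={w:.3f}, intercept={b:.3f}")
```

The raw `(x, y)` pairs never leave `local_update`. Only the locally computed model parameters are passed to the averaging step, which is the property the abstract emphasizes. Weighting by client size is one common choice for handling unbalanced data and is an assumption of this sketch.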

Deep learning methods; federated learning; communication cost reduction
