Obsidian: Cooperative State-Space Exploration for Performant Inference on Secure ML Accelerators
Abstract
Trusted execution environments (TEEs) for machine learning accelerators are indispensable for secure and efficient ML inference. Optimizing workloads through state-space exploration for the accelerator architectures improves performance and energy consumption. However, such explorations are expensive and slow due to the large search space. Current research has to use fast analytical models that forgo critical hardware details and cross-layer opportunities unique to the hardware security primitives. While cycle-accurate models can theoretically reach better designs, their high runtime cost restricts them to a smaller state space. We present Obsidian, an optimization framework for finding the optimal mapping from ML kernels to a secure ML accelerator. Obsidian addresses the above challenge by exploring the state space using analytical and cycle-accurate models cooperatively. The two main exploration components are: (1) a secure accelerator analytical model that includes the effect of secure hardware while traversing the large mapping state space and produces the best m model mappings; and (2) a compiler profiling step on a cycle-accurate model that captures runtime bottlenecks to further improve execution runtime, energy, and resource utilization and find the optimal model mapping. We compare our results to a baseline secure accelerator comprising the state-of-the-art security schemes obtained from GuardNN [33] and Sesame [11]. The analytical model reduces the inference latency by 20.5% for a cloud and 8.4% for an edge deployment, with energy improvements of 24% and 19%, respectively. The cycle-accurate model further reduces the latency by 9.1% for a cloud and 12.2% for an edge deployment, with energy improvements of 13.8% and 13.1%, respectively.
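The cooperative two-stage search described in the abstract can be illustrated with a minimal Python sketch. This is not Obsidian's implementation: the function name `cooperative_search`, the two cost callables, and the toy tile-shape cost functions are all hypothetical stand-ins chosen for illustration. The sketch only shows the general pattern of ranking every candidate with a cheap model, keeping the best m, and re-ranking that shortlist with an expensive model.

```python
import heapq
from typing import Callable, Iterable, List, Tuple, TypeVar

Mapping = TypeVar("Mapping")


def cooperative_search(
    candidates: Iterable[Mapping],
    analytical_cost: Callable[[Mapping], float],
    cycle_accurate_cost: Callable[[Mapping], float],
    m: int,
) -> Tuple[Mapping, List[Mapping]]:
    """Return (best mapping, shortlist) from a two-stage search.

    Stage 1 scores every candidate with the cheap analytical model and
    keeps the m lowest-cost mappings. Stage 2 scores only those m
    mappings with the expensive model and returns the cheapest one.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    # Stage 1: cheap model over the whole state space.
    shortlist = heapq.nsmallest(m, candidates, key=analytical_cost)
    if not shortlist:
        raise ValueError("no candidate mappings were supplied")
    # Stage 2: expensive model over the shortlist only.
    best = min(shortlist, key=cycle_accurate_cost)
    return best, shortlist


if __name__ == "__main__":
    # Hypothetical candidates: (tile_height, tile_width) pairs.
    tiles = [(h, w) for h in (1, 2, 4, 8) for w in (1, 2, 4, 8)]

    def toy_analytical(tile: Tuple[int, int]) -> float:
        # Assumed cost shape: fewer, larger tiles cut compute passes,
        # while larger tiles add a per-tile security overhead.
        h, w = tile
        return 64 / (h * w) + 0.5 * (h + w)

    def toy_cycle_accurate(tile: Tuple[int, int]) -> float:
        # Assumed runtime effect that the cheap model does not see.
        penalty = 2.0 if tile[1] == 8 else 0.0
        return toy_analytical(tile) + penalty

    best, shortlist = cooperative_search(
        tiles, toy_analytical, toy_cycle_accurate, m=4
    )
    print("shortlist:", shortlist)
    print("best:", best)
```

The expensive model is called exactly m times in this sketch, which is the point of the split: m trades off how much of the state space the slower model gets to see against total search time.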
A multi-neural network acceleration architecture
Eunjin Baek, Dongup Kwon, Jangwoo Kim
Published: 2020
Triton: Software-defined threat model for secure multi-tenant ML inference accelerators
Sarbartha Banerjee, Shijia Wei, Prakash Ramrakhyani, Mohit Tiwari
Published: 2023
Energy-efficient protocols and hardware architectures for transport layer security
Utsav Banerjee
Published: 2017
An energy-efficient reconfigurable DTLS cryptographic engine for securing internet-of-things applications
Utsav Banerjee, Andrew Wright, Chiraag Juvekar, Madeleine Waller, Anantha P Chandrakasan, et al.
Published: 2019
Deep learning with non-medical training used for chest pathology identification
Yaniv Bar, Idit Diamant, Lior Wolf, Hayit Greenspan
Published: 2015
Chest pathology detection using deep learning with non-medical training
Yaniv Bar, Idit Diamant, Lior Wolf, Sivan Lieberman, Eli Konen, Hayit Greenspan
Published: 2015
Language models are few-shot learners
T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, D. Amodei
Published: 2020
Side channel attacks for architecture extraction of neural networks
Hervé Chabanne, Jean-Luc Danger, Linda Guiga, Ulrich Kühne
Published: 2021
TVM: An automated End-to-End optimizing compiler for deep learning
Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
Published: 2018
Learning to optimize tensor programs
Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
Published: 2018
Eyeriss v2: A flexible accelerator for emerging deep neural networks on mobile devices
Yu-Hsin Chen, Tien-Ju Yang, Joel Emer, Vivienne Sze
Published: 2019
PREMA: A predictive multi-task scheduling algorithm for preemptible neural processing units
Yujeong Choi, Minsoo Rhu
Published: 2020
DAGguise: Mitigating memory timing side channels
Peter W Deutsch, Yuheng Yang, Thomas Bourgeat, Jules Drean, Joel S Emer, Mengjia Yan
Published: 2022
Planaria: Dynamic architecture fission for spatial multi-tenant acceleration of deep neural networks
Soroush Ghodrati, Byung Hoon Ahn, Joon Kyung Kim, Sean Kinzer, Brahmendra Reddy Yatham, Navateja Alla, Hardik Sharma, Mohammad Alian, Eiman Ebrahimi, Nam Sung Kim, et al.
Published: 2020
CryptoNets: Applying neural networks to encrypted data with high throughput and accuracy
Ran Gilad-Bachrach, Nathan Dowlin, Kim Laine, Kristin Lauter, Michael Naehrig, John Wensing
Published: 2016
EIE: Efficient inference engine on compressed deep neural network
Song Han, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A Horowitz, William J Dally
Published: 2016
Identity mappings in deep residual networks
K. He, X. Zhang, S. Ren, J. Sun
Published: 2016
Mind mappings: enabling efficient algorithm-accelerator mapping space search
Kartik Hegde, Po-An Tsai, Sitao Huang, Vikas Chandra, Angshuman Parashar, Christopher W Fletcher
Published: 2021
GuardNN: Secure accelerator architecture for privacy-preserving deep learning
Weizhe Hua, Muhammad Umar, Zhiru Zhang, G Edward Suh
Published: 2022
MGX: Near-zero overhead memory protection for data-intensive accelerators
Weizhe Hua, Muhammad Umar, Zhiru Zhang, G. Edward Suh
Published: 2022
Reverse engineering convolutional neural networks through side-channel information leaks
W. Hua, Z. Zhang, G. E. Suh
Published: 2018
CoSA: Scheduling by constrained optimization for spatial accelerators
Qijing Huang, Minwoo Kang, Grace Dinh, Thomas Norell, Aravind Kalaiah, James Demmel, John Wawrzynek, Yakun Sophia Shao
Published: 2021
GAMMA: Automating the HW mapping of DNN models on accelerators via genetic algorithm
Sheng-Chun Kao, Tushar Krishna
Published: 2020
MAGMA: An optimization framework for mapping multiple DNNs on multiple accelerator cores
Sheng-Chun Kao, Tushar Krishna
Published: 2022
DiGamma: Domain-aware genetic algorithm for HW-mapping co-optimization for DNN accelerators
Sheng-Chun Kao, Michael Pellauer, Angshuman Parashar, Tushar Krishna
Published: 2022
Ramulator: A fast and extensible dram simulator
Yoongu Kim, Weikun Yang, Onur Mutlu
Published: 2016
ImageNet classification with deep convolutional neural networks
A. Krizhevsky, I. Sutskever, G. E. Hinton
Published: 2017
Understanding reuse, performance, and hardware cost of DNN dataflow: A data-centric approach
Hyoukjun Kwon, Prasanth Chatarasi, Michael Pellauer, Angshuman Parashar, Vivek Sarkar, Tushar Krishna
Published: 2019
MAESTRO: A data-centric approach to understand reuse, performance, and hardware cost of DNN mappings
Hyoukjun Kwon, Prasanth Chatarasi, Vivek Sarkar, Tushar Krishna, Michael Pellauer, Angshuman Parashar
Published: 2020
MAERI: Enabling flexible dataflow mapping over DNN accelerators via reconfigurable interconnects
Hyoukjun Kwon, Ananda Samajdar, Tushar Krishna
Published: 2018
SecureLoop: Design space exploration of secure DNN accelerators
Kyungmi Lee, Mengjia Yan, Joel S Emer, Anantha P Chandrakasan
Published: 2023
TNPU: Supporting trusted execution with tree-less integrity protection for neural processing unit
Sunho Lee, Jungwoo Kim, Seonjin Na, Jongse Park, Jaehyuk Huh
Published: 2022
Cracking classifiers for evasion: A case study on the Google’s phishing pages filter
Bin Liang, Miaoqiang Su, Wei You, Wenchang Shi, Gang Yang
Published: 2016
A systolic array for rapid string comparison
Richard J Lipton, Daniel Lopresti
Published: 1985
Deep learning and convolutional neural networks for medical image computing
Le Lu, Yefeng Zheng, Gustavo Carneiro, Lin Yang
Published: 2017
Timeloop: A systematic approach to DNN accelerator evaluation
Angshuman Parashar, Priyanka Raina, Yakun Sophia Shao, Yu-Hsin Chen, Victor A Ying, Anurag Mukkara, Rangharajan Venkatesan, Brucek Khailany, Stephen W Keckler, Joel Emer
Published: 2019
Faster R-CNN: Towards real-time object detection with region proposal networks
Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun
Published: 2015
A systematic methodology for characterizing scalability of DNN accelerators using SCALE-Sim
Ananda Samajdar, Jan Moritz Joseph, Yuhao Zhu, Paul Whatmough, Matthew Mattina, Tushar Krishna
Published: 2020
Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition
M. Sharif, S. Bhagavatula, L. Bauer, M. K. Reiter
Published: 2016
Securator: A fast and secure neural processing unit
Nivedita Shrivastava, Smruti Ranjan Sarangi
Published: 2023
Mastering the game of Go with deep neural networks and tree search
D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot
Published: 2016
Decoupled access/execute computer architectures
James E Smith
Published: 1982
Attention is all you need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, Illia Polosukhin
Published: 2017
Confidential computing within an AI accelerator
Kapil Vaswani, Stavros Volos, Cédric Fournet, Antonio Nino Diaz, Ken Gordon, Balaji Vembu, Sam Webster, David Chisnall, Saurabh Kulkarni, Graham Cunningham, Richard Osborne, Daniel Wilkinson
Published: 2023
Blackbox attacks on sequential recommenders via data-free model extraction
Zhenrui Yue, Zhankui He, Huimin Zeng, Julian McAuley
Published: 2021
ShEF: Shielded enclaves for cloud FPGAs
Mark Zhao, Mingyu Gao, Christos Kozyrakis
Published: 2022
FlexTensor: An automatic schedule exploration and optimization framework for tensor computation on heterogeneous system
Size Zheng, Yun Liang, Shuo Wang, Renze Chen, Kaiwen Sheng
Published: 2020
Camouflage: Memory traffic shaping to mitigate timing attacks
Yanqi Zhou, Sameer Wagh, Prateek Mittal, David Wentzlaff
Published: 2017