Distilled Large Language Model in Confidential Computing Environment for System-on-Chip Design

TOP Literature Database Distilled Large Language Model in Confidential Computing Environment for System-on-Chip Design

arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2507.16226

PDF

https://arxiv.org/pdf/2507.16226

Paper Information

Author: Dong Ben,Hui Feng,Qian Wang
Published: 7-22-2025
Affiliation: University of California, Merced
Country: United States of America
Conference

Labels Estimated by AI

Prompt leaking Factors of Performance Degradation Model Extraction Attack

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Large Language Models (LLMs) are increasingly used in circuit design tasks and have typically undergone multiple rounds of training. Both the trained models and their associated training data are considered confidential intellectual property (IP) and must be protected from exposure. Confidential Computing offers a promising solution to protect data and models through Trusted Execution Environments (TEEs). However, existing TEE implementations are not designed to support the resource-intensive nature of LLMs efficiently. In this work, we first present a comprehensive evaluation of the LLMs within a TEE-enabled confidential computing environment, specifically utilizing Intel Trust Domain Extensions (TDX). We constructed experiments on three environments: TEE-based, CPU-only, and CPU-GPU hybrid implementations, and evaluated their performance in terms of tokens per second. Our first observation is that distilled models, i.e., DeepSeek, surpass other models in performance due to their smaller parameters, making them suitable for resource-constrained devices. Also, in the quantized models such as 4-bit quantization (Q4) and 8-bit quantization (Q8), we observed a performance gain of up to 3x compared to FP16 models. Our findings indicate that for fewer parameter sets, such as DeepSeek-r1-1.5B, the TDX implementation outperforms the CPU version in executing computations within a secure environment. We further validate the results using a testbench designed for SoC design tasks. These validations demonstrate the potential of efficiently deploying lightweight LLMs on resource-constrained systems for semiconductor CAD applications.

External Datasets

Math500

KSU HWSEC