Abstract
Privacy computing is receiving increasing attention, but writing privacy
computing code remains challenging for developers for two reasons: library
functions are limited, so functionality must often be implemented from scratch,
and the data-oblivious requirement contradicts the intuitive thinking and usual
practices of programmers. Automating the generation of privacy computing code with Large
Language Models can streamline development effort and lower the barrier to
using privacy computing frameworks. However, existing LLMs still encounter
challenges in code translation for privacy-preserving computation, such as
translating Python to MP-SPDZ, due to the scarcity of MP-SPDZ data required for
effective pre-training or fine-tuning. Moreover, the lack of a benchmark
further complicates the evaluation of translation quality. To address these
limitations, this work proposes SPDZCoder, a rule-based framework that combines
LLMs with expert knowledge for generating privacy-computing code without
requiring additional training data. Specifically, SPDZCoder employs a rigorous
procedure for collecting high-quality expert knowledge that captures the
differences in how Python and MP-SPDZ express semantics, and for deriving
transformation rules for translating Python to MP-SPDZ based on this
knowledge. Then, SPDZCoder progressively converts Python code into MP-SPDZ code
using the transformation rules in a three-stage pipeline. To evaluate SPDZCoder, we
manually constructed a benchmark dataset, SPDZEval, which comprises six data
splits, each representing a distinct class of challenging tasks in MP-SPDZ
implementation. Extensive experiments show that SPDZCoder significantly
outperforms the baselines in both pass@1 and pass@2. Specifically, SPDZCoder
attains an overall correctness of 85.94% in pass@1 and 92.01% in pass@2,
whereas the best-performing baseline achieves 63.58% and 76.36%, respectively.
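
The data-oblivious requirement mentioned above can be illustrated with a minimal sketch in plain Python. This is not MP-SPDZ code and is not taken from SPDZCoder; the function names are hypothetical, and ordinary integers stand in for secret values. It only shows the kind of rewrite involved: a data-dependent branch is replaced by arithmetic selection, so the same operations run whatever the inputs are.

```python
def branching_max(a: int, b: int) -> int:
    # Usual Python style: which statement executes depends on the data.
    if a > b:
        return a
    return b


def oblivious_max(a: int, b: int) -> int:
    # Data-oblivious style: turn the comparison into a 0/1 value and
    # combine both candidates arithmetically, with no data-dependent branch.
    # Assumption: in a secure framework the comparison itself would be
    # evaluated on secret values; here plain ints are used for illustration.
    gt = int(a > b)
    return gt * a + (1 - gt) * b
```

Both functions return the same result for every input, but only the second avoids control flow that depends on the values being compared.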