Exploiting Code Symmetries for Learning Program Semantics


arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2308.03312

PDF

https://arxiv.org/pdf/2308.03312

Bibliographic information

Authors: Kexin Pei; Weichen Li; Qirui Jin; Shuyang Liu; Scott Geng; Lorenzo Cavallaro; Junfeng Yang; Suman Jana
Published: 2023-8-7
Updated: 2024-9-9
Affiliation: Columbia University
Country of affiliation: United States of America
Conference: International Conference on Machine Learning (ICML)

Labels estimated by AI

Machine learning techniques; Program interpretation graph; Vulnerability detection

Note: These labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the literature database".

Abstract

This paper tackles the challenge of teaching code semantics to Large Language Models (LLMs) for program analysis by incorporating code symmetries into the model architecture. We introduce a group-theoretic framework that defines code symmetries as semantics-preserving transformations, where forming a code symmetry group enables precise and efficient reasoning of code semantics. Our solution, SymC, develops a novel variant of self-attention that is provably equivariant to code symmetries from the permutation group defined over the program dependence graph. SymC obtains superior performance on five program analysis tasks, outperforming state-of-the-art code models without any pre-training. Our results suggest that code LLMs that encode the code structural prior via the code symmetry group generalize better and faster.
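The equivariance property named in the abstract can be illustrated with a toy sketch. It is not SymC's implementation: the additive adjacency bias, the four-node "diamond" graph, and all function names below are assumptions made for illustration only. The sketch checks that a self-attention layer whose scores are biased by a graph matrix commutes with a node permutation when that permutation is an automorphism of the graph.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def graph_biased_attention(X, A):
    # Toy layer (an assumption, not the paper's formulation):
    # scores = X X^T / sqrt(d) + A, output = softmax(scores) X.
    n, d = len(X), len(X[0])
    out = []
    for i in range(n):
        scores = [
            sum(X[i][k] * X[j][k] for k in range(d)) / math.sqrt(d) + A[i][j]
            for j in range(n)
        ]
        w = softmax(scores)
        out.append([sum(w[j] * X[j][k] for j in range(n)) for k in range(d)])
    return out

def permute_rows(M, perm):
    # Row i of the result is row perm[i] of M.
    return [M[p] for p in perm]

def is_automorphism(A, perm):
    # perm preserves the graph if A[perm[i]][perm[j]] == A[i][j] for all i, j.
    n = len(A)
    return all(A[perm[i]][perm[j]] == A[i][j] for i in range(n) for j in range(n))

def all_close(P, Q, tol=1e-9):
    return all(abs(a - b) <= tol for r, s in zip(P, Q) for a, b in zip(r, s))

# Hypothetical dependence graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3.
A = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
# Arbitrary node features, one row per node.
X = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8], [0.1, 0.4]]

# Swapping nodes 1 and 2 leaves the edge set unchanged.
sigma = [0, 2, 1, 3]

permute_then_attend = graph_biased_attention(permute_rows(X, sigma), A)
attend_then_permute = permute_rows(graph_biased_attention(X, A), sigma)
equivariant = is_automorphism(A, sigma) and all_close(permute_then_attend, attend_then_permute)
print(equivariant)
```

In this sketch the graph bias is what restricts the symmetry: without it the layer would commute with every permutation of the nodes, while with it the guarantee holds only for permutations that preserve the graph. For example, swapping nodes 0 and 1 is not an automorphism of this graph, so the argument above does not apply to it.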

External datasets

Java dataset collected by Allamanis et al. (2016)

Defects4J

27 open-source projects, such as OpenSSL, ImageMagick, CoreUtils, SQLite