VeriContaminated: Assessing LLM-Driven Verilog Coding for Data Contamination

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2503.13572

PDF

https://arxiv.org/pdf/2503.13572

Paper Information

Author: Zeng Wang, Minghao Shao, Jitendra Bhandari, Likhitha Mankali, Ramesh Karri, Ozgur Sinanoglu, Muhammad Shafique, Johann Knechtel
Published: 3-17-2025
Updated: 6-12-2025
Affiliation: NYU Tandon School of Engineering
Country: United States of America
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Prompt leaking, FPGA, Program Interpretation Graph

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Large Language Models (LLMs) have revolutionized code generation, achieving exceptional results on various established benchmarking frameworks. However, concerns about data contamination, where benchmark data inadvertently leaks into pre-training or fine-tuning datasets, raise questions about the validity of these evaluations. While this issue is known to limit the industrial adoption of LLM-driven software engineering, hardware coding has received little to no attention regarding these risks. For the first time, we analyze state-of-the-art (SOTA) evaluation frameworks for Verilog code generation (VerilogEval and RTLLM), using established methods for contamination detection (CCD and Min-K% Prob). We cover SOTA commercial and open-source LLMs (CodeGen2.5, Minitron 4b, Mistral 7b, phi-4 mini, LLaMA-{1,2,3.1}, GPT-{2,3.5,4o}, Deepseek-Coder, and CodeQwen 1.5), in baseline and fine-tuned models (RTLCoder and Verigen). Our study confirms that data contamination is a critical concern. We explore mitigations and the resulting trade-offs for code quality vs. fairness (i.e., reducing contamination toward unbiased benchmarking).
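
For readers unfamiliar with Min-K% Prob, one of the detection methods named in the abstract, the general idea can be sketched as follows: score a text by the average log-probability of its least likely k% of tokens under the model, and treat an unusually high score as a hint that the text may have been seen during training. This is a minimal illustration of that idea only, not the paper's implementation. The function names, the default k, and the threshold are assumptions chosen for the example, and the per-token log-probabilities are assumed to have been obtained from the model separately.

```python
import math


def min_k_percent_score(token_logprobs, k=0.2):
    """Average log-probability of the lowest k fraction of tokens.

    token_logprobs: per-token log-probabilities that a model assigned to
    a text (hypothetical input, computed elsewhere).
    k: fraction of tokens to keep, in (0, 1]. The default is illustrative.
    """
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    if not 0 < k <= 1:
        raise ValueError("k must be in (0, 1]")
    # Keep at least one token so short inputs still produce a score.
    n = max(1, math.ceil(len(token_logprobs) * k))
    lowest = sorted(token_logprobs)[:n]
    return sum(lowest) / n


def looks_contaminated(token_logprobs, k=0.2, threshold=-2.0):
    """Flag a text whose score exceeds a threshold.

    The threshold is an assumed placeholder; in practice it would need
    to be calibrated per model and dataset.
    """
    return min_k_percent_score(token_logprobs, k) > threshold


if __name__ == "__main__":
    # Toy log-probabilities, invented for illustration only.
    likely_seen = [-0.1, -0.2, -0.05, -0.3, -0.15]
    likely_unseen = [-0.1, -4.0, -0.2, -6.0, -0.3]
    print(min_k_percent_score(likely_seen, k=0.2))
    print(min_k_percent_score(likely_unseen, k=0.4))
    print(looks_contaminated(likely_seen), looks_contaminated(likely_unseen))
```

A text with no surprising tokens keeps a score close to zero, whereas even a few very unlikely tokens pull the score down sharply, which is why the method looks only at the low-probability tail rather than the mean over all tokens.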

External Datasets

RTLCoder

Verigen