Generating tabular data under differential privacy (DP) protection ensures
theoretical privacy guarantees but poses challenges for training machine
learning models, primarily due to the need to capture complex structures under
noisy supervision signals. Recently, pre-trained Large Language Models (LLMs)
-- even those at the scale of GPT-2 -- have demonstrated great potential in
synthesizing tabular data. However, their application under DP constraints
remains largely unexplored. In this work, we address this gap by applying DP
techniques to the generation of synthetic tabular data. Our findings show that
LLMs face difficulties in generating coherent text when fine-tuned with DP, as
privacy budgets are inefficiently allocated to non-private elements like table
structures. To overcome this, we propose DP-2Stage, a two-stage fine-tuning
framework for differentially private tabular data generation. The first stage
involves non-private fine-tuning on a pseudo dataset, followed by DP
fine-tuning on a private dataset. Our empirical results show that this approach
improves performance across various settings and metrics compared to LLMs
fine-tuned directly under DP. We release our code and setup at
https://github.com/tejuafonja/DP-2Stage.
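
The second (private) stage of such a pipeline is typically trained with DP-SGD, which clips each example's gradient and adds Gaussian noise before the update. The sketch below is a minimal, self-contained illustration of that clip-and-noise step; the function name, parameters, and values are illustrative assumptions, not the paper's exact configuration.

```python
import math
import random

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Illustrative DP-SGD aggregation (stage-2 style private training):
    clip each per-example gradient to `clip_norm`, sum, add Gaussian noise
    with std = noise_multiplier * clip_norm, then average.
    """
    rng = rng or random.Random(0)
    dim = len(per_example_grads[0])
    summed = [0.0] * dim
    for g in per_example_grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip_norm / (norm + 1e-12))  # per-example clipping
        for i, x in enumerate(g):
            summed[i] += x * scale
    sigma = noise_multiplier * clip_norm  # Gaussian mechanism noise scale
    noised = [s + rng.gauss(0.0, sigma) for s in summed]
    return [x / len(per_example_grads) for x in noised]

# Example: two per-example gradients, no noise, so clipping is visible.
update = dp_sgd_step([[3.0, 4.0], [0.1, 0.0]], clip_norm=1.0,
                     noise_multiplier=0.0)
```

With `noise_multiplier=0.0`, each clipped gradient has norm at most `clip_norm`, so the averaged update is bounded by `clip_norm` as well; in real DP fine-tuning the noise is nonzero and the privacy budget is tracked by an accountant.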