Large Language Models for Cryptocurrency Transaction Analysis: A Bitcoin Case Study

arxiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2501.18158

PDF

https://arxiv.org/pdf/2501.18158

Paper Information

Author: Yuchen Lei, Yuexin Xiang, Qin Wang, Rafael Dowsley, Tsz Hon Yuen, Kim-Kwang Raymond Choo, Jiangshan Yu
Published: 1-30-2025
Updated: 9-4-2025
Affiliation: School of Cyber Science and Engineering, Wuhan University
Country: China
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Indirect Prompt Injection, Graph Analysis, Fraudulent Transaction

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Cryptocurrencies are widely used, yet current methods for analyzing transactions often rely on opaque, black-box models. While these models may achieve high performance, their outputs are usually difficult to interpret and adapt, making it challenging to capture nuanced behavioral patterns. Large language models (LLMs) have the potential to address these gaps, but their capabilities in this area remain largely unexplored, particularly in cybercrime detection. In this paper, we test this hypothesis by applying LLMs to real-world cryptocurrency transaction graphs, with a focus on Bitcoin, one of the most studied and widely adopted blockchain networks. We introduce a three-tiered framework to assess LLM capabilities: foundational metrics, characteristic overview, and contextual interpretation. This includes a new, human-readable graph representation format, LLM4TG, and a connectivity-enhanced transaction graph sampling algorithm, CETraS. Together, they significantly reduce token requirements, transforming the analysis of multiple moderately large-scale transaction graphs with LLMs from nearly impossible to feasible under strict token limits. Experimental results demonstrate that LLMs have outstanding performance on foundational metrics and characteristic overview, where the accuracy of recognizing most basic information at the node level exceeds 98.50% and the proportion of obtaining meaningful characteristics reaches 95.00%. Regarding contextual interpretation, LLMs also demonstrate strong performance in classification tasks, even with very limited labeled data, where top-3 accuracy reaches 72.43% with explanations. While the explanations are not always fully accurate, they highlight the strong potential of LLMs in this domain. At the same time, several limitations persist, which we discuss along with directions for future research.
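
The abstract reports classification quality as top-3 accuracy: a sample counts as correct when the true label appears among the model's three highest-ranked guesses. The sketch below shows how such a metric is typically computed. The function name `top_k_accuracy`, the category labels, and the sample data are hypothetical illustrations, not taken from the paper or its datasets.

```python
from typing import Sequence


def top_k_accuracy(ranked_predictions: Sequence[Sequence[str]],
                   true_labels: Sequence[str],
                   k: int = 3) -> float:
    """Fraction of samples whose true label is among the first k ranked guesses."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    hits = sum(
        1 for guesses, truth in zip(ranked_predictions, true_labels)
        if truth in guesses[:k]
    )
    return hits / len(true_labels)


# Hypothetical data: each inner list is a model's ranked guesses for one
# transaction graph, best guess first. Labels are illustrative only.
predictions = [
    ["exchange", "mining pool", "gambling"],
    ["darknet market", "mixer", "exchange"],
    ["gambling", "exchange", "mixer"],
    ["mixer", "ransomware", "gambling"],
]
labels = ["mining pool", "exchange", "ransomware", "mixer"]

accuracy = top_k_accuracy(predictions, labels, k=3)
print(f"top-3 accuracy: {accuracy:.2%}")  # prints "top-3 accuracy: 75.00%"
```

With `k=1` the same data yields 25%, which illustrates why a top-3 figure should not be read as the rate at which the first guess is correct.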

External Datasets

BASD

BABD