AI Security Portal K Program
Towards Next-Generation Steganalysis: LLMs Unleash the Power of Detecting Steganography
Abstract
Linguistic steganography offers a convenient way to hide messages, particularly with the emergence of AI generation technology. The potential abuse of this technology raises security concerns, calling for powerful linguistic steganalysis to detect carriers containing steganographic messages. Existing methods are limited to finding distribution differences between steganographic texts and normal texts from the perspective of symbolic statistics. However, these distribution differences are hard to model precisely, which severely limits the detection ability of existing methods in realistic scenarios. To find a feasible way to build practical steganalysis for the real world, this paper proposes employing the human-like text processing abilities of large language models (LLMs) to capture the difference from the perspective of human perception, in addition to the traditional statistical perspective. Specifically, we systematically investigate the performance of LLMs on this task by modeling it under a generative paradigm instead of the traditional classification paradigm. Extensive experimental results reveal that generative LLMs exhibit significant advantages in linguistic steganalysis and demonstrate performance trends distinct from traditional approaches. The results also show that LLMs outperform existing baselines by a wide margin, and that the domain-agnostic ability of LLMs makes it possible to train a generic steganalysis model. Both code and trained models are openly available at https://github.com/ba0z1/Linguistic-Steganalysis-with-LLMs.
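To make the contrast between the generative and classification paradigms concrete, the following is a minimal sketch of how steganalysis could be cast as text generation: the detector builds an instruction prompt, lets a language model generate a short answer, and maps that answer back to a label. The prompt wording, the label words `stego` and `cover`, and the helper names (`build_prompt`, `parse_answer`, `detect`, `toy_generate`) are all assumptions made for illustration, not the authors' implementation; `toy_generate` is a stand-in for a real LLM call.

```python
from typing import Callable

# Hypothetical instruction template (an assumption; the paper's actual
# prompt is not given in this abstract).
PROMPT_TEMPLATE = (
    "Decide whether the following text contains hidden steganographic content.\n"
    "Text: {text}\n"
    "Answer with one word, 'stego' or 'cover'.\n"
    "Answer:"
)

# Hypothetical label words the model is asked to generate.
LABELS = ("stego", "cover")


def build_prompt(text: str) -> str:
    """Wrap the text under inspection in the instruction template."""
    return PROMPT_TEMPLATE.format(text=text.strip())


def parse_answer(generated: str) -> str:
    """Map free-form generated text to a label.

    Returns the label word that appears first in the output,
    or 'unknown' if neither label word appears.
    """
    lowered = generated.lower()
    positions = {label: lowered.find(label) for label in LABELS}
    found = {label: pos for label, pos in positions.items() if pos != -1}
    if not found:
        return "unknown"
    return min(found, key=found.get)


def detect(text: str, generate: Callable[[str], str]) -> str:
    """Generative steganalysis: prompt, generate, then parse the answer."""
    return parse_answer(generate(build_prompt(text)))


def toy_generate(prompt: str) -> str:
    """Stand-in for an LLM call so the sketch runs without a model.

    It is a fixed rule for demonstration only and has no detection ability.
    """
    return " Cover" if "weather" in prompt else " Stego."
```

In a classification setup, the model would instead end in a fixed two-way output layer. Here the decision is read from generated text, so any text-in, text-out model can be passed as `generate`.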
Efficient steganography in JPEG images by minimizing performance of optimal detector
R. Cogranne, Q. Giboulot, P. Bas
Published: 2022
Adaptive batch size image merging steganography and quantized Gaussian image steganography
M. Sharifzadeh, M. Aloraini, D. Schonfeld
Published: 2020
Side-informed steganography for JPEG images by modeling decompressed images
J. Butora, P. Bas
Published: 2023
AHCM: Adaptive Huffman code mapping for audio steganography based on psychoacoustic model
X. Yi, K. Yang, X. Zhao, Y. Wang, H. Yu
Published: 2019
Audio steganography based on iterative adversarial attacks against convolutional neural networks
J. Wu, B. Chen, W. Luo, Y. Fang
Published: 2020
RNN-Stega: Linguistic steganography based on recurrent neural networks
Zhong-Liang Yang, Xiao-Qing Guo, Zi-Ming Chen, Yong-Feng Huang, Yu-Jin Zhang
Published: 2019
VAE-Stega: Linguistic steganography based on variational auto-encoder
Zhong-Liang Yang, Si-Yu Zhang, Yu-Ting Hu, Zhi-Wen Hu, Yong-Feng Huang
Published: 2021
Provably secure generative linguistic steganography
S. Zhang, Z. Yang, J. Yang, Y. Huang
Published: 2021
Perfectly Secure Steganography Using Minimum Entropy Coupling
Christian Schroeder de Witt, Samuel Sokota, J. Zico Kolter, Jakob Foerster, Martin Strohmeier
Published: 2022
Discop: Provably secure steganography in practice based on “distribution copies”
Jinyang Ding, Kejiang Chen, Yaofei Wang, Na Zhao, Weiming Zhang, Nenghai Yu
Published: 2023
Semantic-preserving linguistic steganography by pivot translation and semantic-aware bins coding
Tianyu Yang, Hanzhou Wu, Biao Yi, Guorui Feng, Xinpeng Zhang
Published: 2024
Generating steganographic text with LSTMs
T. Fang, M. Jaggi, K. Argyraki
Published: 2017
Trends in steganography
E. Zielinska, W. Mazurczyk, K. Szczypiorski
Published: 2014
Information hiding: challenges for forensic experts
W. Mazurczyk, S. Wendzel
Published: 2017
Terror groups hide behind web encryption
J. Kelley
Published: 2001
Bin Laden exploits technology to suit his needs
D. Sieberg
Published: 2001
With cryptography easier to detect, cybercriminals now hide malware in plain sight. Call it steganography. Here’s how it works.
L. M. Cameron
Published: 2018
Steganalysis against substitution-based linguistic steganography based on context clusters
Z. Chen, L. Huang, H. Miao, W. Yang, P. Meng
Published: 2011
Linguistic steganalysis using the features derived from synonym frequency
L. Xiang, X. Sun, G. Luo, B. Xia
Published: 2014
A fast and efficient text steganalysis method
Z. Yang, Y. Huang, Y.-J. Zhang
Published: 2019
TS-CSW: Text steganalysis and hidden capacity estimation based on convolutional sliding windows
Z. Yang, Y. Huang, Y. Zhang
Published: 2020
TS-RNN: Text steganalysis based on recurrent neural networks
Z. Yang, K. Wang, J. Li, Y. Huang, Y.-J. Zhang
Published: 2019
Linguistic steganalysis with graph neural networks
H. Wu, B. Yi, F. Ding, G. Feng, X. Zhang
Published: 2021
Detection of generative linguistic steganography based on explicit and latent text word relation mining using deep learning
S. Li, J. Wang, P. Liu
Published: 2022
Linguistic steganalysis merging semantic and statistical features
S. Guo, J. Liu, Z. Yang, W. You, R. Zhang
Published: 2022
An effective linguistic steganalysis framework based on hierarchical mutual learning
Y. Xue, L. Kong, W. Peng, P. Zhong, J. Wen
Published: 2022
SCL-Stega: Exploring advanced objective in linguistic steganalysis using contrastive learning
J. Wen, L. Gao, G. Fan, Z. Zhang, J. Jia, Y. Xue
Published: 2023
Text steganalysis based on hierarchical supervised learning and dual attention mechanism
W. Peng, S. Li, Z. Qian, X. Zhang
Published: 2023
Linguistic steganalysis by enhancing and integrating local and global features
Q. Xu, R. Zhang, J. Liu
Published: 2023
High-performance linguistic steganalysis, capacity estimation and steganographic positioning
J. Zou, Z. Yang, S. Zhang, S. u. Rehman, Y. Huang
Published: 2020
Link: Linguistic steganalysis framework with external knowledge
J. Yang, Z. Yang, X. Ge, J. Zou, Y. Gao, Y. Huang
Published: 2023
Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing
Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, Graham Neubig
Published: 2023
Exploration of the effectiveness and characteristics of ChatGPT in steganalysis tasks
M. B. et al.
Published: 2023
A hybrid r-bilstmc neural network based text steganalysis
Y. Niu, J. Wen, P. Zhong, Y. Xue
Published: 2019
Linguistic steganalysis via densely connected lstm with feature pyramid
H. Yang, Y. Bao, Z. Yang, S. Liu, Y. Huang, S. Jiao
Published: 2020
Coverless text information hiding method based on the word rank map
J. Zhang, J. Shen, L. Wang, H. Lin
Published: 2016
Coverless text information hiding method using the frequent words hash
J. Zhang, H. Huang, L. Wang, H. Lin, D. Gao
Published: 2017
Towards linguistic steganography: A systematic investigation of approaches, systems, and issues
R. Bergmair
Published: 2004
Natural language watermarking: Design, analysis, and a proof-of-concept implementation
Mikhail J Atallah, Victor Raskin, Michael Crogan, Christian Hempelmann, Florian Kerschbaum, Dina Mohamed, Sanket Naik
Published: 2001
Language models are few-shot learners
T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, D. Amodei
Published: 2020
PaLM: Scaling Language Modeling with Pathways
Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov, Noah Fiedel
Published: 2022
BLOOM: A 176B-parameter open-access multilingual language model
BigScience Workshop
Published: 2023
Attention is all you need
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin
Published: 2017
GPT-4 technical report
OpenAI
Published: 2023
Crosslingual generalization through multitask finetuning
N. Muennighoff, T. W. et al.
Published: 2023