Advances in image tampering pose serious security threats, underscoring the
need for effective image manipulation localization (IML). While supervised IML
achieves strong performance, it depends on costly pixel-level annotations.
Existing weakly supervised or training-free alternatives often underperform and
lack interpretability. We propose the In-Context Forensic Chain (ICFC), a
training-free framework that leverages multi-modal large language models
(MLLMs) for interpretable IML. ICFC combines two components: objectified rule
construction with adaptive filtering, which builds a reliable knowledge base, and
a multi-step progressive reasoning pipeline that mirrors expert forensic
workflows, progressing from coarse proposals to fine-grained forensic results. This design
enables systematic exploitation of MLLM reasoning for image-level
classification, pixel-level localization, and text-level interpretability.
Across multiple benchmarks, ICFC not only surpasses state-of-the-art
training-free methods but also achieves competitive or superior performance
compared to weakly and fully supervised approaches.
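
As a rough illustration of the two components described above, the following
sketch filters a pool of candidate rules and then runs a chain of reasoning
steps in which each step sees the retained rules plus all earlier answers. It
is a minimal sketch under stated assumptions, not the ICFC implementation: the
names Rule, filter_rules, and run_chain are hypothetical, the mean-score
threshold is only one plausible reading of "adaptive filtering", and the MLLM
is represented by an arbitrary text-in, text-out callable.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """A forensic rule in text form with a reliability score (hypothetical)."""
    text: str
    score: float


def filter_rules(rules: List[Rule]) -> List[Rule]:
    """Keep rules scoring at or above the pool mean.

    Assumption: the threshold adapts to the candidate pool rather than being
    a fixed constant. The source does not specify the filtering criterion.
    """
    if not rules:
        return []
    mean_score = sum(r.score for r in rules) / len(rules)
    return [r for r in rules if r.score >= mean_score]


def run_chain(
    mllm: Callable[[str], str],
    rules: List[Rule],
    steps: List[str],
) -> List[str]:
    """Run reasoning steps in order, feeding each answer into later prompts.

    `mllm` stands in for any multi-modal model call; image inputs are omitted
    here to keep the sketch self-contained.
    """
    context = "Rules:\n" + "\n".join(f"- {r.text}" for r in rules)
    answers: List[str] = []
    for step in steps:
        prompt = f"{context}\nTask: {step}"
        answer = mllm(prompt)
        answers.append(answer)
        # Accumulate the answer so the next step can build on it.
        context += f"\n{step}: {answer}"
    return answers
```

A caller might pass steps such as "classify the image", "propose coarse
regions", "refine the mask", and "explain the decision", which loosely follows
the coarse-to-fine ordering described in the abstract; the step wording is
illustrative only.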