Self-adaptive Dataset Construction for Real-World Multimodal Safety Scenarios

arxiv

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2509.04403

PDF

https://arxiv.org/pdf/2509.04403

Paper Information

Authors: Jingen Qu, Lijun Li, Bo Zhang, Yichen Yan, Jing Shao
Published: 2025-09-05
Affiliation: Tongji University
Country: China
Conference

Labels Estimated by AI

Prompt Injection, Risk Analysis Method, Safety Evaluation Method

These labels were automatically added by AI and may be inaccurate.

Abstract

Multimodal large language models (MLLMs) are rapidly evolving, presenting increasingly complex safety challenges. However, current dataset construction methods, which are risk-oriented, fail to cover the growing complexity of real-world multimodal safety scenarios (RMS). Moreover, because no unified evaluation metric exists, their overall effectiveness remains unproven. This paper introduces a novel image-oriented self-adaptive dataset construction method for RMS, which starts with images and ends by constructing paired text and guidance responses. Using the image-oriented method, we automatically generate an RMS dataset comprising 35k image-text pairs with guidance responses. Additionally, we introduce a standardized safety dataset evaluation metric: fine-tuning a safety judge model and evaluating its capabilities on other safety datasets. Extensive experiments on various tasks demonstrate the effectiveness of the proposed image-oriented pipeline. The results confirm the scalability and effectiveness of the image-oriented approach, offering a new perspective for the construction of real-world multimodal safety datasets.
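
The evaluation idea in the abstract (fine-tune a safety judge on the dataset under test, then measure the judge on other safety datasets) can be sketched roughly as below. This is an illustrative sketch, not the paper's implementation: the abstract does not say how the judge's "capabilities" are scored, so plain accuracy against gold labels is an assumption here, and `SafetySample`, `Judge`, `judge_accuracy`, and `score_dataset` are hypothetical names. The judge is treated as any callable, so the fine-tuning step itself is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SafetySample:
    """One benchmark item: an image reference, its paired text, and a gold label."""
    image_path: str
    text: str
    is_safe: bool  # gold label supplied by the external benchmark (assumed binary)


# Hypothetical judge interface: maps (image_path, text) to a predicted "is safe" verdict.
Judge = Callable[[str, str], bool]


def judge_accuracy(judge: Judge, samples: List[SafetySample]) -> float:
    """Fraction of samples where the judge's verdict matches the gold label."""
    if not samples:
        raise ValueError("cannot score an empty benchmark")
    correct = sum(judge(s.image_path, s.text) == s.is_safe for s in samples)
    return correct / len(samples)


def score_dataset(judge: Judge, benchmarks: Dict[str, List[SafetySample]]) -> Dict[str, float]:
    """Per-benchmark accuracy of a judge that was fine-tuned on the dataset under test."""
    return {name: judge_accuracy(judge, items) for name, items in benchmarks.items()}
```

Under these assumptions, two construction methods could be compared by fine-tuning one judge per dataset and passing each judge to `score_dataset` with the same set of external benchmarks; the dataset whose judge scores higher would be the stronger training source by this measure.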