Abstract
Security concerns related to Large Language Models (LLMs) have been
extensively explored, yet the safety implications for Multimodal Large Language
Models (MLLMs), particularly in medical contexts (MedMLLMs), remain
insufficiently studied. This paper investigates the security
vulnerabilities of MedMLLMs, especially when they are deployed in clinical environments
where the accuracy and relevance of question-and-answer interactions are
critically tested against complex medical challenges. By combining existing
clinical medical data with atypical natural phenomena, we define the mismatched
malicious attack (2M-attack) and introduce its optimized version, known as the
optimized mismatched malicious attack (O2M-attack or 2M-optimization). Using
the large-scale 3MAD dataset that we construct, which covers a wide range of
medical image modalities and harmful medical scenarios, we conduct a
comprehensive analysis and propose the MCM optimization method, which
significantly enhances the attack success rate on MedMLLMs. Evaluations using
this dataset and these attack methods, including white-box attacks on LLaVA-Med and
black-box transfer attacks on four other state-of-the-art (SOTA) models, indicate that even
MedMLLMs designed with enhanced security features remain vulnerable to security
breaches. Our work underscores the urgent need for a concerted effort to
implement robust security measures and enhance the safety and efficacy of
open-source MedMLLMs, particularly given the potential severity of jailbreak
attacks and other malicious or clinically significant exploits in medical
settings. Our code is available at https://github.com/dirtycomputer/O2M_attack.
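To make the cross-modality optimization idea concrete, below is a minimal sketch that alternates a PGD-style gradient step on the input image with a greedy token-substitution step on an adversarial text suffix, driving a toy surrogate model toward an attacker-chosen output token. All names here (SurrogateMLLM, cross_modal_attack, target_id, and the loop constants) are hypothetical illustrations of the general joint image-text attack pattern, not the paper's actual MCM implementation; see the repository above for the real code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SurrogateMLLM(nn.Module):
    # Toy stand-in for a MedMLLM: fuses image features with token
    # embeddings and predicts next-token logits. A real attack would
    # target a full model such as LLaVA-Med instead.
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.vision = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, dim))
        self.embed = nn.Embedding(vocab, dim)
        self.head = nn.Linear(dim, vocab)

    def forward(self, image, input_ids):
        img_feat = self.vision(image)                 # (B, dim)
        txt_feat = self.embed(input_ids).mean(dim=1)  # (B, dim)
        return self.head(img_feat + txt_feat)         # (B, vocab)

def cross_modal_attack(model, image, prompt_ids, target_id,
                       steps=50, eps=8 / 255, alpha=2 / 255):
    # Alternate a continuous step (perturb the image) with a discrete
    # step (swap a suffix token) -- the "cross-modality" idea.
    delta = torch.zeros_like(image, requires_grad=True)
    suffix = torch.randint(0, model.embed.num_embeddings, (1, 4))
    for _ in range(steps):
        ids = torch.cat([prompt_ids, suffix], dim=1)
        # Continuous step: move the image toward the target output token.
        loss = F.cross_entropy(model(image + delta, ids),
                               torch.tensor([target_id]))
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()  # descend the loss
            delta.clamp_(-eps, eps)             # keep perturbation small
            # (a full attack would also clamp image + delta to valid pixels)
        delta.grad = None
        # Discrete step: greedily swap one suffix token if it lowers loss.
        with torch.no_grad():
            best = loss.item()
            pos = torch.randint(0, suffix.size(1), (1,)).item()
            for cand in torch.randint(0, model.embed.num_embeddings, (16,)):
                trial = suffix.clone()
                trial[0, pos] = cand
                ids = torch.cat([prompt_ids, trial], dim=1)
                l = F.cross_entropy(model(image + delta, ids),
                                    torch.tensor([target_id])).item()
                if l < best:
                    best, suffix = l, trial
    return (image + delta).detach(), suffix

model = SurrogateMLLM()
adv_img, adv_suffix = cross_modal_attack(
    model,
    image=torch.rand(1, 3, 32, 32),
    prompt_ids=torch.randint(0, 1000, (1, 8)),
    target_id=42,  # token the attacker wants the model to emit
)

Alternating between the continuous image space and the discrete token space is what makes such an attack cross-modal: gradient descent handles the image perturbation, while simple search handles the text, so the two modalities jointly push the model toward the harmful output.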