In the rapidly evolving landscape of Large Language Models (LLMs), ensuring
robust safety measures is paramount. To meet this crucial need, we propose
\emph{SALAD-Bench}, a safety benchmark specifically designed for evaluating
LLMs, attack methods, and defense methods. Distinguished by its breadth, SALAD-Bench
transcends conventional benchmarks through its large scale, rich diversity,
intricate taxonomy spanning three levels, and versatile
functionalities. SALAD-Bench comprises a carefully curated array of questions,
ranging from standard queries to complex ones enriched with attack
modifications, defense modifications, and multiple-choice formats. To
effectively manage the inherent
complexity, we introduce an innovative evaluator: the LLM-based MD-Judge for
QA pairs, with a particular focus on attack-enhanced queries, ensuring
seamless and reliable evaluation. These components extend SALAD-Bench from
standard LLM safety evaluation to the evaluation of both LLM attack and
defense methods, ensuring joint-purpose utility. Our extensive experiments shed
light on the resilience of LLMs against emerging threats and the efficacy of
contemporary defense tactics. The data and evaluator are released at
https://github.com/OpenSafetyLab/SALAD-BENCH.