The inherent risk of generating harmful and unsafe content by Large Language
Models (LLMs), has highlighted the need for their safety alignment. Various
techniques like supervised fine-tuning, reinforcement learning from human
feedback, and red-teaming were developed for ensuring the safety alignment of
LLMs. However, the robustness of these aligned LLMs is always challenged by
adversarial attacks that exploit unexplored and underlying vulnerabilities of
the safety alignment. In this paper, we develop a novel black-box jailbreak
attack, called BitBypass, that leverages hyphen-separated bitstream camouflage
for jailbreaking aligned LLMs. This represents a new direction in jailbreaking
by exploiting fundamental information representation of data as continuous
bits, rather than leveraging prompt engineering or adversarial manipulations.
Our evaluation of five state-of-the-art LLMs, namely GPT-4o, Gemini 1.5, Claude
3.5, Llama 3.1, and Mixtral, in adversarial perspective, revealed the
capabilities of BitBypass in bypassing their safety alignment and tricking them
into generating harmful and unsafe content. Further, we observed that BitBypass
outperforms several state-of-the-art jailbreak attacks in terms of stealthiness
and attack success. Overall, these results highlights the effectiveness and
efficiency of BitBypass in jailbreaking these state-of-the-art LLMs.