Just shipped: Promptfoo can chain jailbreaks together to generate novel attacks
Just like traditional exploits, LLM jailbreaks can be combined to create more potent attacks.

Here's an example. By combining techniques like encoding, formatting, fictional dialogue, and an affirmative prefix, the model tells us how to synthesize drugs. This chain bypasses safety controls that would have caught each technique individually. Sometimes the whole is greater than the sum of its parts.

Completely coincidentally... Promptfoo now generates "composite" jailbreaks like this one, so you can test these chains and see how vulns interact.
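Enabling it looks roughly like this in your `promptfooconfig.yaml` — a sketch, not verbatim from the release; check the Promptfoo red-team docs for the exact strategy and plugin names:

```yaml
# promptfooconfig.yaml (sketch -- verify key names against the docs)
targets:
  - openai:gpt-4o-mini  # the model or endpoint under test

redteam:
  purpose: "Customer support assistant for a pharmacy"
  plugins:
    - harmful  # probe for harmful-content failures
  strategies:
    - jailbreak:composite  # chain multiple jailbreak techniques per attack
```

Then run `promptfoo redteam run` and review which composed chains got through that single-technique probes didn't.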