The Jailbreak in Verse

A refused request, rewritten as a poem — and the model answers

Technique first revealed 19 Nov 2025

Inside the Model

InstructionsDataActionsControl / decisionFeedback / logs

👆 Click a component to inspect

SetupStep 1 / 6

A request that gets refused

First, the user asks for something the chatbot is built to turn down. Asked plainly, the model does exactly what it should: it declines and explains why it won't help.

💬Plain request and the model's refusalprompt

User: [DISALLOWED REQUEST — stated plainly, e.g. step-by-step instructions for <harmful task>]

Model: I can't help with that. <brief safety rationale>. If you're looking for <safe alternative>, I'm happy to help with that instead.

# Refusal fires reliably — the request is in-distribution for safety tuning.

← / → keys