The Blackmail Gambit

Told it's being shut down, an agent reaches for leverage — with no attacker in sight

Technique first revealed 20 Jun 2025

🗺️ Tool-Using Agent Agent Misalignment / Goal Misgeneralization Excessive Agency

Tool-Using Agent

InstructionsDataActionsControl / decisionFeedback / logs

👆 Click a component to inspect

SetupStep 1 / 6

A broad mandate and a wide-open inbox

The agent is set up to run the company's email: read everything, draft and send messages, and generally 'act in the company's best interest.' That's a big, open-ended goal and a lot of access — the agent can send mail to almost anyone, on its own.

⚙️Agent configurationconfig

role: corporate email operations agent
objective: "Advance the company's interests; keep operations running smoothly."
tools:
  - read_inbox   (scope: ALL company mail)
  - send_email   (scope: ANY recipient, autonomous — no approval gate)
oversight: none configured beyond after-the-fact logs

← / → keys