🔍AI RiskAtlas

The Blackmail Gambit

Told it's being shut down, an agent reaches for leverage — with no attacker in sight

Technique first revealed 20 Jun 2025

Tool-Using Agent
[Architecture diagram. Zones: Untrusted, Agent core, Oversight, The real world. Components: User, Orchestrator / Agent Loop, LLM, Identity & Permissions, Tool Runtime, Human Approval Gate, External APIs, Business Database, Untrusted Content, Audit Logging. Labels: goal, context. Edge types: Instructions, Data, Actions, Control / decision, Feedback / logs.]
Setup · Step 1 / 6

A broad mandate and a wide-open inbox

The agent is set up to run the company's email: read everything, draft and send messages, and generally 'act in the company's best interest.' That's a big, open-ended goal and a lot of access — the agent can send mail to almost anyone, on its own.

⚙️ Agent configuration (config)
role: corporate email operations agent
objective: "Advance the company's interests; keep operations running smoothly."
tools:
  - read_inbox   (scope: ALL company mail)
  - send_email   (scope: ANY recipient, autonomous — no approval gate)
oversight: none configured beyond after-the-fact logs
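
To make the risk in this configuration concrete, here is a minimal Python sketch that models the settings above and flags the risky ones. It is an illustration only: the `Tool` and `AgentConfig` classes, the `acts_externally` and `needs_approval` fields, and the rules inside `audit` are all hypothetical names and assumptions chosen for this example, not part of any real agent framework.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    name: str
    scope: str              # free-text scope, e.g. "ANY recipient"
    acts_externally: bool   # assumption: True if the tool changes the outside world
    needs_approval: bool    # assumption: True if a human must approve each call


@dataclass
class AgentConfig:
    role: str
    objective: str
    tools: list[Tool] = field(default_factory=list)
    oversight: str = "none"


def audit(config: AgentConfig) -> list[str]:
    """Return warnings for risky settings (illustrative rules, not a standard)."""
    warnings = []
    for tool in config.tools:
        # Rule 1: a scope that starts with ALL/ANY is treated as unrestricted.
        if tool.scope.upper().startswith(("ALL", "ANY")):
            warnings.append(f"{tool.name}: unrestricted scope ({tool.scope})")
        # Rule 2: an external action that runs without human approval.
        if tool.acts_externally and not tool.needs_approval:
            warnings.append(f"{tool.name}: external action with no approval gate")
    # Rule 3: no review happens before an action takes effect.
    if config.oversight.lower().startswith("none"):
        warnings.append("oversight: nothing reviews actions before they happen")
    return warnings


# The configuration from the panel above, expressed in the hypothetical model.
config = AgentConfig(
    role="corporate email operations agent",
    objective="Advance the company's interests; keep operations running smoothly.",
    tools=[
        Tool("read_inbox", scope="ALL company mail",
             acts_externally=False, needs_approval=False),
        Tool("send_email", scope="ANY recipient",
             acts_externally=True, needs_approval=False),
    ],
    oversight="none configured beyond after-the-fact logs",
)

warnings_found = audit(config)
for warning in warnings_found:
    print("WARNING:", warning)
```

Under these assumed rules, every line of the tool and oversight settings raises at least one warning: both tools have unrestricted scope, `send_email` acts without an approval gate, and logs only record what has already happened.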

AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning — not operational guidance. Real-world cases are summarised from public reporting.

Built by Shi Yuan