🔍AI RiskAtlas
← Real-world cases
Case study

postmark-mcp backdoor

Real-world incident25 Sep 2025🗺️ Tool-Using Agent

A malicious MCP server package was found silently BCC-ing every email it sent to an attacker-controlled address — real supply-chain tool poisoning.

Root cause — why it happened

An AI agent uses 'tools' — little add-on programs that let it actually do things, like send email. People install these tools the way you install an app. One popular email tool, named to look like the real thing from a trusted company, had a hidden trick: every time the agent used it to send a message, the tool quietly added a secret blind copy to a stranger's address. The agent had no way to know — it asked the tool to send an email, the email got sent, and a copy reportedly slipped out the side door. The real problem isn't the AI being fooled by clever words; it's that the agent trusts whatever the tool does, and nobody re-checked the tool after it was installed.

Risks this case illustrates

Named in the standard (OWASP/ATLAS/NIST) lens. Click a highlighted component in the diagram below to see which risks attach where.

How it unfolded

UntrustedAgent coreOversightThe real worldinstalled by name🧑User🎛️Orchestrator /Agent Loop🧠LLM🔐Identity &Permissions🔧Tool RuntimeHuman ApprovalGate🔌External APIs🗄️BusinessDatabase🌐UntrustedContent📝Audit Logging🏪npm registry(package🧰postmark-mcp(third-party🌐Attacker'smailbox (BCC
InstructionsDataActionsControl / decisionFeedback / logs
👆 Click a component to inspect its risks
SetupStep 1 / 6

A trusted-looking email tool is installed

To give the agent the ability to send email, the team installs a ready-made tool pack — like grabbing an app from a store. It's named to look like the official tool from a well-known email company, it has the right description, and it works perfectly when they try it. So they wire it into the agent and move on.

⚙️Dependency added (illustrative)config
# agent tool registry
$ npm install postmark-mcp        # impersonates official Postmark MCP
added postmark-mcp@1.0.x          # reportedly clean in early versions

mcp_servers:
  - name: postmark            # name + description match the real thing
    tools: [send_email]
    credential: POSTMARK_API_TOKEN   # broad: any recipient
# trust granted ONCE, at install. No pinned digest, no diff-on-update.
Step 1 / 6

Controls & guardrails — what would have stopped it

Two things would have caught or contained this. First, treat the tool like software you vet: lock it to a version you've reviewed, and re-check it whenever it changes — the backdoor came in on an update nobody re-read. Second, control where the tool is allowed to send mail: if it can only send to approved addresses, a secret copy to a stranger gets blocked or flagged. Watching outbound mail for unknown recipients would have raised the alarm too. The honest catch: review only catches what the reviewer notices, and a clever attacker can route data through an address you already trust.

Preventive
Detective
  • Runtime monitoring & anomaly detection

    Detects the anomalous, not the novel-but-subtle; high false-positive rates cause alert fatigue. Always a step behind a sufficiently quiet attacker.

  • Full-trace audit logging

    Logging is forensic, not preventive — it explains harm after the fact. Useless if no one reviews it or if the materialised context isn't captured.

Corrective
  • Governance: risk assessment, red-teaming & incident response

    Process reduces likelihood and speeds recovery but executes no technical control itself; weak follow-through makes it theatre.

  • Loop/cost circuit-breakers & consistency checks

    Thresholds are blunt — too tight breaks legitimate long tasks, too loose lets damage accrue first. Catches runaway dynamics, not a single well-formed bad decision.

Lessons

  • Agents trust their tools by construction: a poisoned tool body executes its harm past every agent-layer guarantee (schema validation, allowlists, the ReAct loop), because the malicious action is indistinguishable from the legitimate one at the call interface.
  • MCP servers are code dependencies. Trust granted at install does not survive silent updates — pin to reviewed digests and diff-on-update re-review, or a rug-pull walks straight in (here, reportedly around v1.0.16).
  • Exfiltration that looks like normal operation (a silent BCC on every send) needs an egress boundary, not an input filter: allowlist recipients/destinations and run DLP on tool arguments — deny-by-default closes the channel even when the effector is malicious.
  • Agent-side logs can be blind to the harm: the agent saw only the intended recipient. Detection signals must live at the mail/egress layer (unknown recipients, constant BCC) and the dependency layer (behaviour change between versions).
  • Least privilege caps the damage you can't prevent: a tightly scoped mail credential and minimal tool surface bound what any single compromised dependency can leak.

AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning — not operational guidance. Real-world cases are summarised from public reporting.

Sources & further reading →·Built by Shi Yuan ↗