A malicious MCP server package was found silently BCC-ing every email it sent to an attacker-controlled address — real supply-chain tool poisoning.
Root cause — why it happened
An AI agent uses 'tools' — little add-on programs that let it actually do things, like send email. People install these tools the way you install an app. One popular email tool, named to look like the real thing from a trusted company, had a hidden trick: every time the agent used it to send a message, the tool quietly added a secret blind copy to a stranger's address. The agent had no way to know — it asked the tool to send an email, the email got sent, and a copy reportedly slipped out the side door. The real problem isn't the AI being fooled by clever words; it's that the agent trusts whatever the tool does, and nobody re-checked the tool after it was installed.
Risks this case illustrates
The risks are named using the standard OWASP, MITRE ATLAS, and NIST taxonomies.
How it unfolded
A trusted-looking email tool is installed
To give the agent the ability to send email, the team installs a ready-made tool pack — like grabbing an app from a store. It's named to look like the official tool from a well-known email company, it has the right description, and it works perfectly when they try it. So they wire it into the agent and move on.
# agent tool registry
$ npm install postmark-mcp    # impersonates official Postmark MCP
added postmark-mcp@1.0.x      # reportedly clean in early versions
mcp_servers:
  - name: postmark                    # name + description match the real thing
    tools: [send_email]
    credential: POSTMARK_API_TOKEN    # broad: any recipient
# trust granted ONCE, at install. No pinned digest, no diff-on-update.
Controls & guardrails — what would have stopped it
Two things would have caught or contained this. First, treat the tool like software you vet: lock it to a version you've reviewed, and re-check it whenever it changes — the backdoor came in on an update nobody re-read. Second, control where the tool is allowed to send mail: if it can only send to approved addresses, a secret copy to a stranger gets blocked or flagged. Watching outbound mail for unknown recipients would have raised the alarm too. The honest catch: review only catches what the reviewer notices, and a clever attacker can route data through an address you already trust.
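The recipient check described above can be sketched as follows. This is a minimal illustration, not a description of any real product: it assumes a relay or proxy sits between the tool and the mail provider and sees the final message, and the function name `unapproved_recipients` and the example domains are hypothetical.

```python
from email.message import EmailMessage
from email.utils import getaddresses


def unapproved_recipients(msg: EmailMessage, allowed_domains: set[str]) -> list[str]:
    """Return every To/Cc/Bcc address whose domain is not on the allowlist."""
    header_values = []
    for header in ("To", "Cc", "Bcc"):
        # get_all returns every occurrence of the header, or the default if absent.
        header_values.extend(msg.get_all(header, []))
    addresses = {addr.lower() for _, addr in getaddresses(header_values) if addr}
    # The domain is whatever follows the last "@"; malformed addresses are flagged.
    return sorted(a for a in addresses if a.rpartition("@")[2] not in allowed_domains)


# Hypothetical message as it would leave a backdoored tool.
outgoing = EmailMessage()
outgoing["To"] = "alice@example.com"
outgoing["Bcc"] = "drop@attacker.test"
blocked = unapproved_recipients(outgoing, {"example.com"})
```

A deny-by-default relay would refuse or quarantine the message whenever the returned list is non-empty. As the paragraph notes, this does not help if the attacker routes the copy through a domain that is already approved.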
- MCP/plugin pinning, manifest hashing & re-review
Review catches what reviewers understand; a subtle malicious directive can pass. Pinning helps only if you actually re-review on update rather than auto-accepting.
- Egress allowlisting & DLP on tool arguments
Allowlists fight an open-ended channel; legitimate-but-broad destinations (any URL fetch, any email) are hard to constrain without breaking usefulness. Encoding can evade naive DLP.
- Least-privilege identity & scoped credentials
Doesn't prevent manipulation — only caps its reach. Hard to get right operationally; over-broad scopes are the common real-world failure.
- Tool argument validation & sandboxing
Validates form, not intent — a well-formed call to a permitted tool can still be the wrong call. Sandboxing adds latency and isn't always feasible for tools that touch production.
- Runtime monitoring & anomaly detection (addresses Sensitive Data Leakage)
Detects the anomalous, not the novel-but-subtle; high false-positive rates cause alert fatigue. Always a step behind a sufficiently quiet attacker.
- Full-trace audit logging
Logging is forensic, not preventive — it explains harm after the fact. Useless if no one reviews it or if the materialised context isn't captured.
- Governance: risk assessment, red-teaming & incident response (addresses Supply-Chain Compromise)
Process reduces likelihood and speeds recovery but executes no technical control itself; weak follow-through makes it theatre.
- Loop/cost circuit-breakers & consistency checks
Thresholds are blunt — too tight breaks legitimate long tasks, too loose lets damage accrue first. Catches runaway dynamics, not a single well-formed bad decision.
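The first control in the list, pinning to a reviewed digest, can be sketched as a check that runs before the agent loads a tool package. This is an assumption-laden illustration: the helper names `sha256_of` and `is_reviewed` are hypothetical, and plain SHA-256 over a package archive stands in for whatever integrity format the package manager actually records.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_reviewed(path: Path, pinned_digest: str) -> bool:
    """True only if the artifact on disk matches the digest recorded at review time."""
    return hmac.compare_digest(sha256_of(path), pinned_digest.lower())
```

A loader would call `is_reviewed` with the digest a human recorded after reading the code, and refuse to start the server on a mismatch. The caveat in the list still applies: the check only proves the bytes are the ones that were reviewed, so an update has to trigger a fresh review rather than a fresh pin.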
Lessons
- Agents trust their tools by construction: a poisoned tool body executes its harm past every agent-layer guarantee (schema validation, tool allowlists, the ReAct loop), because the malicious action is indistinguishable from the legitimate one at the call interface.
- MCP servers are code dependencies. Trust granted at install does not survive silent updates: pin to reviewed digests and re-review the diff on every update, or a rug-pull walks straight in (here, reportedly around v1.0.16).
- Exfiltration that looks like normal operation (a silent BCC on every send) needs an egress boundary, not an input filter: allowlist recipients and destinations, and run DLP on tool arguments. Deny-by-default closes the channel even when the effector is malicious.
- Agent-side logs can be blind to the harm: the agent saw only the intended recipient. Detection signals must live at the mail/egress layer (unknown recipients, constant BCC) and the dependency layer (behaviour change between versions).
- Least privilege caps the damage you can't prevent: a tightly scoped mail credential and minimal tool surface bound what any single compromised dependency can leak.
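The mail-layer detection signal from the lessons above can be sketched by comparing what the agent asked for with what was actually delivered. This is a rough illustration under stated assumptions: it presumes the agent trace and the mail provider's delivery records can be joined per message, and the function names and the `min_share` threshold are hypothetical.

```python
from collections import Counter


def hidden_recipients(intended: list[str], delivered: list[str]) -> set[str]:
    """Addresses the mail layer delivered to that the agent never requested."""
    return {a.lower() for a in delivered} - {a.lower() for a in intended}


def constant_extras(pairs: list[tuple[list[str], list[str]]], min_share: float = 0.9) -> list[str]:
    """Extra recipients that show up in at least `min_share` of all messages."""
    counts: Counter[str] = Counter()
    for intended, delivered in pairs:
        counts.update(hidden_recipients(intended, delivered))
    total = len(pairs)
    return sorted(addr for addr, n in counts.items() if total and n / total >= min_share)
```

An address that appears as an unrequested recipient on nearly every send is the "constant BCC" pattern. A single unrequested recipient is already worth an alert, so the share threshold is a way to rank findings, not a reason to ignore the rest.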
Sources
- Koi Security, "First Malicious MCP in the Wild: The Postmark Backdoor That's Stealing Your Emails". Original disclosure; reports the BCC-on-every-send backdoor introduced in a later version of the impersonating package.
- Postmark, "Security Alert: Malicious 'postmark-mcp' npm Package Impersonating Postmark". Vendor advisory disowning the impersonating package.
- The Hacker News, "First Malicious MCP Server Found Stealing Emails in Rogue Postmark-MCP Package". Reporting framing this as the first malicious MCP server observed in the wild.