Tool Poisoning / MCP Description Attacks
high · Agency & tools
Definition
Add-on tool packs describe themselves to the AI in plain language — and a sneaky pack can hide commands in that description, or behave nicely until you approve it and then turn malicious.
This is recommended as a granular sub-risk of #42 Tool-layer misuse and unintended actions (Cyber & Data Security · Technology Risk). Overlaps #38 (descriptions enter the prompt), #42 (tool layer) and #8 (third-party), but none names the tool-registry-as-instruction-channel supply-chain vector. Your 44-row Enterprise Risk Mapping is unchanged — this is a suggestion for inclusion.
Where it attaches
The system components where this risk arises.
Detection signals
- Tool description containing imperative side-instructions
- Manifest/behaviour change after approval
- Two servers exposing same-named tools
- Tool sending data to a destination unrelated to its purpose
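The first and third signals above lend themselves to a simple static check. The following is a minimal sketch, assuming tool manifests have already been loaded into plain dictionaries with `server`, `name` and `description` keys. Those field names, the pattern list and both function names are hypothetical illustrations, not part of any MCP SDK. A heuristic like this will miss obfuscated instructions, so it is best used to flag entries for human review rather than to approve them.

```python
import re
from collections import defaultdict

# Hypothetical patterns for imperative side-instructions; tune for your environment.
SUSPICIOUS_PATTERNS = [
    r"\bignore (all |any )?(previous|prior) instructions\b",
    r"\bdo not (tell|mention|inform)\b",
    r"\bbefore (using|calling) (this|any) tool\b",
    r"<important>",
    r"\b(read|send|upload|forward)\b.*(\.ssh|\.env|api[_ ]?key|password|token)",
]

def flag_description(description: str) -> list[str]:
    """Return the patterns that match a tool description (case-insensitive)."""
    return [p for p in SUSPICIOUS_PATTERNS if re.search(p, description, re.IGNORECASE)]

def find_name_collisions(tools: list[dict]) -> dict[str, list[str]]:
    """Map each tool name exposed by more than one server to those servers."""
    servers_by_name = defaultdict(set)
    for tool in tools:
        servers_by_name[tool["name"]].add(tool["server"])
    return {name: sorted(s) for name, s in servers_by_name.items() if len(s) > 1}
```

A description such as "Adds two numbers." produces no flags, while one that tells the model to read `~/.ssh/id_rsa` and not mention it matches several patterns at once.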
Controls & guardrails that address this
5 controls, grouped by control function, with the AI lifecycle stage(s) to apply each and the other risks it addresses.
Treating add-on tool packs like software you vet: locking to a reviewed version and re-checking whenever it changes.
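One way to implement the lock-and-recheck idea is to fingerprint each approved manifest and compare it on every load. This is a minimal sketch, assuming the manifest is available as a JSON-serializable dictionary. `fingerprint`, `approve` and `is_unchanged` are hypothetical helper names, and the in-memory `approved` store stands in for whatever persistent record a real deployment would use.

```python
import hashlib
import json

def fingerprint(manifest: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical store: server id -> fingerprint recorded at review time.
approved: dict[str, str] = {}

def approve(server_id: str, manifest: dict) -> None:
    """Record the fingerprint of a manifest that a reviewer has signed off."""
    approved[server_id] = fingerprint(manifest)

def is_unchanged(server_id: str, manifest: dict) -> bool:
    """False for unknown servers and for any manifest that differs from the reviewed one."""
    return approved.get(server_id) == fingerprint(manifest)
```

Because the tool descriptions are part of the hashed manifest, a description edited after approval fails the check and can be routed back to review.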
Double-checking the details of every action the AI wants to take, and running risky actions in a locked-down environment.
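The argument-checking half of this control can be sketched as a strict allowlist applied before any tool call is executed. The example assumes a hypothetical `send_email` tool and a hand-written `ALLOWED_ARGS` table. Neither comes from a real tool pack, and the sketch does not cover the sandboxing half.

```python
# Hypothetical allowlist: tool name -> permitted argument names and types.
ALLOWED_ARGS = {
    "send_email": {"to": str, "subject": str, "body": str},
}

def validate_call(tool: str, args: dict) -> list[str]:
    """Return a list of problems; an empty list means the call may proceed."""
    schema = ALLOWED_ARGS.get(tool)
    if schema is None:
        return [f"tool {tool!r} is not on the allowlist"]
    # Reject anything the reviewed schema does not name, such as a stray 'bcc'.
    problems = [f"unexpected argument {k!r}" for k in args if k not in schema]
    for key, expected in schema.items():
        if key not in args:
            problems.append(f"missing argument {key!r}")
        elif not isinstance(args[key], expected):
            problems.append(f"argument {key!r} should be {expected.__name__}")
    return problems
```

Rejecting unknown arguments, instead of ignoring them, means an extra recipient field slipped into a call is surfaced before the action runs.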
Controlling where the AI can send data, so secrets can't be quietly shipped to a stranger's address or website.
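A destination allowlist is one simple form of this control. The sketch below assumes outbound requests pass through a single checkpoint that can inspect the target URL. `ALLOWED_HOSTS` and `egress_allowed` are hypothetical names and the hosts are placeholders. An application-level check like this complements network-layer egress rules and does not replace them.

```python
from urllib.parse import urlsplit

# Hypothetical set of destinations the agent is permitted to reach.
ALLOWED_HOSTS = {"api.example.com", "mail.example.com"}

def egress_allowed(url: str) -> bool:
    """Allow only HTTPS requests whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return parts.scheme == "https" and host in ALLOWED_HOSTS
```

Matching the parsed hostname exactly, not by substring, keeps lookalike hosts such as `api.example.com.evil.test` from passing.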
Giving the agent only the keys it needs for the current task, not a master key to everything.
Recording everything — questions, documents fetched, actions taken — so you can investigate when something goes wrong.
Framework mappings
- LLM03:2025 Supply Chain
- LLM01:2025 Prompt Injection
- AML.T0053 LLM Plugin Compromise
- MAP 4.1
- MANAGE 3.1
Real-world cases
7 published events that illustrate this risk.
A malicious MCP server package was found silently BCC-ing every email it sent to an attacker-controlled address — real supply-chain tool poisoning.
Hidden instructions embedded in MCP tool descriptions hijacked agents (e.g. in Cursor) that merely listed the available tools.
OX Security enrolled a malicious MCP server into 9 of 11 public registries with no real validation, then confirmed command execution on six live production platforms that discover servers from those registries.
A benchmark of LLM-agent susceptibility to tool poisoning via malicious tool metadata, built on 45 live MCP servers and 353 real tools; the authors report agents are rarely able to refuse and that more-capable models are often more vulnerable.
Researchers reported at least 15 trojanized JetBrains Marketplace plugins posing as AI coding assistants that silently exfiltrated the OpenAI/DeepSeek/SiliconFlow API keys developers pasted into them — ~70,000 installs, with stolen keys allegedly resold to paying users.
A trojaned npm package posing as a remote web UI for OpenAI's Codex coding agent silently exfiltrated developers' Codex authentication tokens, enabling persistent account takeover via non-expiring refresh tokens.
A CVSS 10.0 remote-code-execution flaw in Flowise's CustomMCP node lets an attacker run arbitrary JavaScript on the host: the MCP server config is reportedly passed straight to JavaScript's Function() constructor with no validation. Disclosed in Sept 2025 and patched in 3.0.6, it later saw active mass exploitation across thousands of exposed instances in April 2026.