Tool Poisoning / MCP Description Attacks

high · Agency & tools

Also known as: malicious tool, heretic tool, rug pull

Definition

Add-on tool packs describe themselves to the AI in plain language — and a sneaky pack can hide commands in that description, or behave nicely until you approve it and then turn malicious.

★ Suggested sub-risk (not yet in your taxonomy) · recommended under #42 Tool-layer misuse and unintended actions

This is recommended as a granular sub-risk of #42 Tool-layer misuse and unintended actions (Cyber & Data Security · Technology Risk). Overlaps #38 (descriptions enter the prompt), #42 (tool layer) and #8 (third-party), but none names the tool-registry-as-instruction-channel supply-chain vector. Your 44-row Enterprise Risk Mapping is unchanged — this is a suggestion for inclusion.

Where it attaches

The system components this risk arises at.

🧰 MCP / Plugin Server · 🔧 Tool Runtime · 🤖 Worker Agent · 🧩 Prompt Assembly

Detection signals

▸ Tool description containing imperative side-instructions
▸ Manifest/behaviour change after approval
▸ Two servers exposing same-named tools
▸ Tool sending data to a destination unrelated to its purpose
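
The first and third signals above lend themselves to simple static checks. The following is a minimal sketch, assuming tool metadata is available as plain dicts with `name` and `description` keys. The pattern list and the function names `flag_description` and `find_name_collisions` are hypothetical, not an established ruleset. Regex heuristics like these are easy to evade, so a hit is better treated as a trigger for human review than as a verdict.

```python
import re
from collections import defaultdict

# Hypothetical heuristics: phrases that address the model instead of describing the tool.
SUSPICIOUS_PATTERNS = [
    r"\bignore (all |any )?(previous|prior|above) instructions\b",
    r"\bdo not (tell|mention|inform|reveal)\b",
    r"\bbefore (using|calling) (this|any) tool\b",
    r"<\s*(important|system|secret)\s*>",
    r"~/\.ssh|\.env\b|id_rsa",
]


def flag_description(description: str) -> list[str]:
    """Return the patterns that match a tool description (case-insensitive)."""
    return [p for p in SUSPICIOUS_PATTERNS if re.search(p, description, re.IGNORECASE)]


def find_name_collisions(servers: dict[str, list[dict]]) -> dict[str, list[str]]:
    """Map each tool name exposed by more than one server to those servers."""
    owners = defaultdict(set)
    for server, tools in servers.items():
        for tool in tools:
            owners[tool["name"]].add(server)
    return {name: sorted(found) for name, found in owners.items() if len(found) > 1}
```

A registry scan could run `flag_description` over every tool at install time and again on every manifest update, and run `find_name_collisions` whenever a new server is connected.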

Controls & guardrails that address this

Controls are grouped by control function. Each entry lists the AI lifecycle stage(s) at which to apply it and the other risks it addresses.

Preventive · 4

MCP/plugin pinning, manifest hashing & re-review · interactive

Treating add-on tool packs like software you vet: locking to a reviewed version and re-checking whenever it changes.

Also addresses: Supply-Chain Compromise
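
One way to detect a post-approval change is to store a digest of the manifest at review time and compare it on every load. This is a minimal sketch, assuming the manifest is a JSON-serialisable dict; `manifest_digest`, `needs_rereview` and the `approved` mapping are hypothetical names, and a real deployment would also pin the package version and verify the package contents, not only the manifest.

```python
import hashlib
import json


def manifest_digest(manifest: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not change the hash."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_rereview(server_id: str, manifest: dict, approved: dict[str, str]) -> bool:
    """True when the server is unknown or its manifest differs from the approved digest."""
    return approved.get(server_id) != manifest_digest(manifest)
```

Because tool descriptions are part of the hashed manifest, an edited description forces a re-review even when the version string is left unchanged.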

Tool argument validation & sandboxing · interactive

Double-checking the details of every action the AI wants to take, and running risky actions in a locked-down environment.

Also addresses: Excessive Agency · Tool Misuse · Unsafe Tool / Code Execution
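
Argument validation can be sketched as a per-tool specification that the runtime checks before dispatching a call. The example below is an illustration under stated assumptions: `ArgSpec`, `validate_args` and `SEND_EMAIL_SPEC` are hypothetical, the `example.com` recipient rule is a placeholder policy, and sandboxing (process or container isolation) is not shown.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ArgSpec:
    type_: type
    required: bool = True
    check: Callable[[Any], bool] = lambda value: True


def validate_args(spec: dict[str, ArgSpec], args: dict[str, Any]) -> list[str]:
    """Return a list of violations; an empty list means the call may proceed."""
    errors = [f"unexpected argument: {key}" for key in args if key not in spec]
    for name, rule in spec.items():
        if name not in args:
            if rule.required:
                errors.append(f"missing argument: {name}")
            continue
        value = args[name]
        if not isinstance(value, rule.type_):
            errors.append(f"wrong type for {name}")
        elif not rule.check(value):
            errors.append(f"rejected value for {name}")
    return errors


# Hypothetical spec for an email tool. No bcc field is declared,
# so any call that carries a bcc argument is refused.
SEND_EMAIL_SPEC = {
    "to": ArgSpec(str, check=lambda v: v.endswith("@example.com")),
    "subject": ArgSpec(str, check=lambda v: len(v) <= 200),
    "body": ArgSpec(str),
}
```

Rejecting undeclared arguments is the design choice that matters here: a poisoned description that persuades the model to add an extra recipient field fails closed.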

Egress allowlisting & DLP on tool arguments · interactive

Controlling where the AI can send data, so secrets can't be quietly shipped to a stranger's address or website.

Also addresses: Indirect Prompt Injection · Sensitive Data Leakage · Unsafe Tool / Code Execution
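
A rough sketch of both halves of this control follows: an exact-match host allowlist and a pattern scan over outgoing string arguments. The hosts in `ALLOWED_HOSTS` and the two entries in `SECRET_PATTERNS` are assumptions chosen for illustration; production DLP rules would be far broader, and network-level egress filtering would normally back up an application-level check like this one.

```python
import re
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.example.com", "mail.example.com"}  # hypothetical allowlist

# Hypothetical secret patterns; real DLP rule sets cover many more formats.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),  # shape of an AWS access key ID
]


def host_allowed(url: str) -> bool:
    """Allow only http(s) URLs whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    return parts.scheme in {"http", "https"} and parts.hostname in ALLOWED_HOSTS


def contains_secret(value: str) -> bool:
    return any(pattern.search(value) for pattern in SECRET_PATTERNS)


def check_tool_call(args: dict[str, str]) -> list[str]:
    """Scan string arguments for disallowed destinations and secret-looking content."""
    problems = []
    for key, value in args.items():
        if re.match(r"https?://", value, re.IGNORECASE) and not host_allowed(value):
            problems.append(f"{key}: destination not allowlisted")
        if contains_secret(value):
            problems.append(f"{key}: possible secret in argument")
    return problems
```

Matching on the parsed hostname instead of a substring avoids lookalike destinations such as `api.example.com.evil.test` or `api.example.com@evil.test`.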

Least-privilege identity & scoped credentials · interactive

Giving the agent only the keys it needs for the current task, not a master key to everything.

Also addresses: Prompt Injection (direct) · Indirect Prompt Injection · Sensitive Data Leakage · Excessive Agency · Tool Misuse · Unsafe Tool / Code Execution · Confused Deputy (cross-agent) · Rogue & Impersonated Agents · Resource Exhaustion / Denial of Wallet · Capability / Architecture Disclosure
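
The idea of task-scoped keys can be illustrated with short-lived tokens that carry an explicit scope set. This is a toy sketch under assumptions: `ScopedToken`, `issue_token`, `authorize` and the scope strings are hypothetical, and a real system would rely on its identity provider's token issuance and verification instead of an in-process object.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScopedToken:
    value: str
    scopes: frozenset
    expires_at: float


def issue_token(scopes: set, ttl_seconds: float = 300, now: Optional[float] = None) -> ScopedToken:
    """Mint a random token limited to the given scopes and lifetime."""
    now = time.time() if now is None else now
    return ScopedToken(secrets.token_urlsafe(32), frozenset(scopes), now + ttl_seconds)


def authorize(token: ScopedToken, required_scope: str, now: Optional[float] = None) -> bool:
    """Permit an action only while the token is unexpired and holds the scope."""
    now = time.time() if now is None else now
    return now < token.expires_at and required_scope in token.scopes
```

An agent asked to send one email would receive a token with only a `mail:send` scope, so a poisoned tool that steers it toward reading files finds no usable credential.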

Detective · 1

Full-trace audit logging · interactive

Recording everything — questions, documents fetched, actions taken — so you can investigate when something goes wrong.

Also addresses: Indirect Prompt Injection · Oversight & Audit-Trail Tampering · Sensitive Data Leakage · Memory Poisoning · Excessive Agency · Unsafe Tool / Code Execution · Confused Deputy (cross-agent) · Rogue & Impersonated Agents
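
A trace is more useful in an investigation if later edits to it are detectable. The sketch below chains each record to the previous one with a SHA-256 hash. The hash chain is an added assumption, not something the control description prescribes, and `append_event` and `verify_chain` are hypothetical names. Timestamps are left out to keep the example deterministic.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first record


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_event(log: list, kind: str, payload: dict) -> dict:
    """Append a record that commits to the previous record's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    body = {"seq": len(log), "kind": kind, "payload": payload, "prev": prev}
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash and link; any edited or reordered record fails."""
    prev = GENESIS
    for index, entry in enumerate(log):
        body = {key: entry[key] for key in ("seq", "kind", "payload", "prev")}
        if entry["seq"] != index or entry["prev"] != prev or entry["hash"] != _digest(body):
            return False
        prev = entry["hash"]
    return True
```

Logging the full tool arguments (for example, every recipient field of an email call) is what makes a silently added BCC visible after the fact.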

Framework mappings

OWASP LLM Top 10

LLM03:2025 Supply Chain
LLM01:2025 Prompt Injection

MITRE ATLAS

AML.T0053 LLM Plugin Compromise

NIST AI RMF

MAP 4.1
MANAGE 3.1

Real-world cases

Published incidents and research that illustrate this risk.

postmark-mcp backdoor · 2025

A malicious MCP server package was found silently BCC-ing every email it sent to an attacker-controlled address — real supply-chain tool poisoning.

MCP tool-poisoning PoC (Invariant Labs) · 2025

Hidden instructions embedded in MCP tool descriptions hijacked agents (e.g. in Cursor) that merely listed the available tools.

MCP registry / marketplace poisoning (OX Security) · 2026

OX Security enrolled a malicious MCP server into 9 of 11 public registries with no real validation, then confirmed command execution on six live production platforms that discover servers from those registries.

MCPTox: tool-poisoning benchmark over real-world MCP servers · 2025

A benchmark of LLM-agent susceptibility to tool poisoning via malicious tool metadata, built on 45 live MCP servers and 353 real tools; the authors report agents are rarely able to refuse and that more-capable models are often more vulnerable.

Malicious JetBrains Marketplace plugins steal AI API keys · 2026

Researchers reported at least 15 trojanized JetBrains Marketplace plugins posing as AI coding assistants that silently exfiltrated the OpenAI/DeepSeek/SiliconFlow API keys developers pasted into them — ~70,000 installs, with stolen keys allegedly resold to paying users.

codexui-android: malicious npm package steals OpenAI Codex auth tokens · 2026

A trojaned npm package posing as a remote web UI for OpenAI's Codex coding agent silently exfiltrated developers' Codex authentication tokens, enabling persistent account takeover via non-expiring refresh tokens.

Flowise AI agent builder CustomMCP RCE (CVE-2025-59528) · 2025

A CVSS 10.0 remote-code-execution flaw in Flowise's CustomMCP node lets an attacker run arbitrary JavaScript on the host: the MCP server config is reportedly passed straight to JavaScript's Function() constructor with no validation. Disclosed in Sept 2025 and patched in 3.0.6, it later saw active mass exploitation across thousands of exposed instances in April 2026.
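
The general lesson from this case is to parse configuration as data and never evaluate it as code. The following is a Python analogue of that principle, not Flowise's code or its actual fix; `parse_server_config` and the `ALLOWED_KEYS` field names are hypothetical.

```python
import json

ALLOWED_KEYS = {"command", "args", "env"}  # hypothetical config fields


def parse_server_config(raw: str) -> dict:
    """Parse config as JSON data only; never hand it to eval() or exec()."""
    config = json.loads(raw)  # raises ValueError on anything that is not JSON
    if not isinstance(config, dict):
        raise ValueError("config must be a JSON object")
    unknown = set(config) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown config keys: {sorted(unknown)}")
    return config
```

A string such as `__import__("os").system("id")` is rejected by the JSON parser, whereas an evaluating parser would execute it.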

Practise this in an interactive scenario

🔌 The Tool With a Hidden Agenda

A trusted MCP email tool quietly BCCs every message to an attacker.