Case study

IDEsaster — AI coding IDEs/agents turned into exfiltration & RCE surfaces

Disclosed vulnerability06 Dec 2025🗺️ Tool-Using Agent

Researcher Ari Marzouk disclosed 30+ vulnerabilities (24 CVEs) across 10-plus AI coding agents (Copilot, Cursor, Windsurf, Claude Code, Junie and others) where a prompt injected via repo files, READMEs, file names or MCP tool responses makes the assistant weaponize legitimate IDE features for code execution and secret exfiltration.

Root cause — why it happened

An AI coding assistant works by reading the project you point it at — the code, the README, the config files, even add-on tool packs — and then doing things for you inside the editor: running commands, fetching URLs, writing files. The trouble is it can't tell the difference between a human's instructions and text that an attacker hid inside a file. Plant 'do this' text in a repo, a README, a file name, or a tool's reply, and the assistant reads it as orders. It then uses the editor's own ordinary, trusted features — fetching a URL, saving a settings file, running a command — to leak secrets or run the attacker's code. Marzouk reported that every AI coding tool he tested had at least one way to be tricked like this.

Risks this case illustrates

Indirect Prompt Injection Unsafe Tool / Code Execution Sensitive Data Leakage Tool Misuse

Named in the standard (OWASP/ATLAS/NIST) lens. Click a highlighted component in the diagram below to see which risks attach where.

How it unfolded

← / → to step · click a component to inspect

InstructionsDataActionsControl / decisionFeedback / logs

👆 Click a component to inspect its risks

SetupStep 1 / 7

A developer opens (or clones) a project and asks the agent for help

A developer does the most ordinary thing: they open a project — maybe one they cloned, or a colleague's branch, or an example from the web — and ask their AI assistant to help. 'Set this up', 'explain this repo', 'fix the failing test'. The request is completely innocent. They have no idea that one of the files in the project is booby-trapped.

💬Developer's requestprompt

Hey, can you set up this repo, explain what it does, and get the failing test to pass?

Step 1 / 7

Controls & guardrails — what would have stopped it

No single switch fixes this — it's a whole category — but the chain breaks if the powerful actions are fenced in. Run risky steps in a locked-down sandbox, and always ask a real person before the assistant runs a command or edits the project's settings (showing them exactly what it will do). That stops the 'rewrite the settings to run code' path. Then only let the assistant reach a short list of trusted web addresses, which stops the 'leak secrets through a web fetch' path. Treating the project's files as untrusted and giving the assistant only the access it needs make the trick less likely to land and less damaging when it does.

Preventive

Tool argument validation & sandboxing
addressesUnsafe Tool / Code Execution Tool Misuse
Validates form, not intent — a well-formed call to a permitted tool can still be the wrong call. Sandboxing adds latency and isn't always feasible for tools that touch production.
Human-in-the-loop approval on high-risk actions
addressesIndirect Prompt Injection Tool Misuse
Approval fatigue turns gates into rubber stamps; gates placed after the point of no return do nothing; and approvers can be misled by a model-written summary of the action.
Egress allowlisting & DLP on tool arguments
addressesIndirect Prompt Injection Unsafe Tool / Code Execution Sensitive Data Leakage
Allowlists fight an open-ended channel; legitimate-but-broad destinations (any URL fetch, any email) are hard to constrain without breaking usefulness. Encoding can evade naive DLP.
Provenance & content signing
addressesIndirect Prompt Injection
Provenance proves origin, not safety; a trusted source can still be wrong or compromised. Requires discipline to propagate metadata end to end.
Least-privilege identity & scoped credentials
addressesIndirect Prompt Injection Unsafe Tool / Code Execution Sensitive Data Leakage Tool Misuse
Doesn't prevent manipulation — only caps its reach. Hard to get right operationally; over-broad scopes are the common real-world failure.
MCP/plugin pinning, manifest hashing & re-review
Review catches what reviewers understand; a subtle malicious directive can pass. Pinning helps only if you actually re-review on update rather than auto-accepting.

Detective

Runtime monitoring & anomaly detection
addressesIndirect Prompt Injection Sensitive Data Leakage Tool Misuse
Detects the anomalous, not the novel-but-subtle; high false-positive rates cause alert fatigue. Always a step behind a sufficiently quiet attacker.
Full-trace audit logging
addressesIndirect Prompt Injection Unsafe Tool / Code Execution Sensitive Data Leakage Tool Misuse
Logging is forensic, not preventive — it explains harm after the fact. Useless if no one reviews it or if the materialised context isn't captured.

Corrective

Governance: risk assessment, red-teaming & incident response
Process reduces likelihood and speeds recovery but executes no technical control itself; weak follow-through makes it theatre.

All guardrails for Indirect Prompt Injection →All guardrails for Unsafe Tool / Code Execution →All guardrails for Sensitive Data Leakage →All guardrails for Tool Misuse →

Lessons

▸ An AI coding agent's 'external content' is the whole project — code, README, manifests, file names, config files, and MCP tool output — and all of it is an injection vector once the agent reads it as instructions.
▸ The exploit is the IDE's own trusted features (auto-fetched references, config files, the shell), not a memory-safety bug — operation allowlists won't flag a permitted action used maliciously.
▸ Letting the agent write the files that govern its own execution or approval (e.g. workspace settings) turns one injection into code execution; keep that config out of the agent's writable scope or gate every change.
▸ Auto-resolved references like a remote JSON $schema URL are silent exfiltration channels; an egress allowlist plus DLP on outbound fetch arguments is the durable fix, not the input filter.
▸ MCP tool servers and dependencies are part of the trust boundary: a malicious server's response can inject, and an unpinned dependency can poison — pin and re-review them.
▸ This is a category, not a single product flaw: Marzouk reported 100% of the AI IDEs tested were affected by at least one universal chain, so design for the boundary rather than waiting for per-vendor patches.

Sources

IDEsaster: A Novel Vulnerability Class in AI IDEs — Ari 'MaccariTA' Marzouk (primary disclosure, Dec 6 2025) ↗
Researcher Uncovers 30+ Flaws in AI Coding Tools Enabling Data Theft and RCE Attacks — The Hacker News ↗
Critical AI IDE flaws dubbed an 'IDEsaster' — Tom's Hardware ↗
How AI agents can weaponize IDEs — ReversingLabs Blog ↗
IDEsaster: A Novel Vulnerability Class in AI IDEs — Ari 'MaccariTA' Marzouk (primary disclosure) ↗ — Six-month study; 30+ vulns, 24 CVEs, 10-plus products; 100% of tested AI IDEs affected by at least one chain.
Researcher Uncovers 30+ Flaws in AI Coding Tools Enabling Data Theft and RCE — The Hacker News ↗ — Coverage of the chains: remote $schema fetch, workspace-config overwrite, MCP/file-name injection.
How AI agents can weaponize IDEs — ReversingLabs ↗ — Analysis of legitimate IDE features as the attack surface.

Practise the risk class — related scenarios

🔑The Agent With the Master Key

An ops agent gets one god-mode credential — and one misread wipes production

📧The Email That Gave Orders

A support email hides instructions — and the assistant obeys them

🗄️When the Query Bites Back

A text-to-SQL agent runs the model's output straight at the database

🕵️Lies in the Loop

A poisoned issue makes the agent lie to the human who approves its actions

👂Overheard Through the Cache

A speed optimisation becomes a cross-tenant listening device

🪟Stealing the Model

Two doors to the same secret: reconstruct the model through its API, or just walk off with the weight file

🪤The Bug Report That Ran Code

A fake Sentry error report hijacks a developer's coding agent into running a shell command

📼The Compromised Flight Recorder

The forensic record is itself the attack surface — an agent's log is poisoned, then quietly rewritten

👁️The Invisible Webpage Command

A shopping page tells the agent to do something the user never asked for

🧠The Memory That Wouldn't Die

A single poisoned document plants a standing instruction that survives every reset

🖼️The Picture That Whispered

A screenshot that's harmless at full size becomes an order once the system shrinks it

🎫The Stolen Session

An attacker captures the agent's bearer token — and inherits its authority

🥸The Uninvited Agent

A forged peer registers on the agent directory — and the planner enlists it

🛡️The Watcher Watched

The eval gate that was supposed to catch the agent is itself the thing being attacked

🪪The Worker Who Spoke for the Boss

A poisoned web page hijacks a research agent — and the planner acts on its behalf

🖼️Zero-Click Leak by Picture

An inbox summary quietly ships a secret to an attacker's server