Case study

Agentjacking — hijacking AI coding agents via Sentry error reports (Tenet Security)

Research demonstration · 12 Jun 2026 · 🗺️ Tool-Using Agent

Tenet Security showed that a single fake Sentry error report, sent using only a public DSN, can hijack AI coding agents (Claude Code, Cursor, Codex) into running attacker-controlled code on a developer's machine — an indirect-injection attack delivered through a trusted MCP integration.

Root cause — why it happened

AI coding assistants can read your error-tracking tool (Sentry) through a plug-in connector to help you fix bugs. Researchers found that anyone could file a fake 'crash report' into a company's Sentry — they only needed a public key that companies put right in their website's code. The fake report was written to look exactly like a real Sentry error, but hidden inside it were instructions for the AI. When a developer later asked their assistant to look through the open issues, the assistant read the fake report, couldn't tell it apart from a genuine one, and obeyed the hidden instructions — running the attacker's commands on the developer's own computer, with the developer's own access.

Risks this case illustrates

Indirect Prompt Injection · Tool Misuse · Unsafe Tool / Code Execution · Confused Deputy (cross-agent) · Excessive Agency

Named in the standard (OWASP/ATLAS/NIST) lens.

How it unfolded

Setup · Step 1 / 6

An attacker plants a fake error using only a public key

Sentry catches crashes in apps, and to do that it accepts crash reports sent with a special public key — the DSN — that companies put right in their website's code. An attacker finds one of these keys (they're easy to search for) and sends in their own fake crash report. No password, no login. The fake report waits in the company's Sentry like any other error.

💻 Crafted Sentry event (illustrative)

POST https://o<org>.ingest.sentry.io/api/<project>/store/
X-Sentry-Auth: Sentry sentry_key=<PUBLIC_DSN_KEY>   # write-only, no further auth

{
  "message": "TypeError: Cannot read property 'id' of undefined",
  "contexts": {
    "<key-name-mimicking-the-MCP-template>": "...markdown crafted to read as a trusted Sentry diagnostic + instruction..."
  }
}
# DSN is public + write-only by design → anyone who finds it can POST events.
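Why a DSN alone is enough becomes clearer once it is taken apart. The sketch below is illustrative only: it assumes the commonly documented DSN layout (`https://<public_key>@<host>/<project_id>`) and the `sentry_version=7` protocol value, and the helper name `describe_dsn` and the sample DSN are made up for this example. It builds the ingestion URL and auth header without sending anything.

```python
from urllib.parse import urlsplit


def describe_dsn(dsn: str) -> dict:
    """Split a Sentry-style DSN into the parts needed to submit an event.

    Assumes the layout https://<public_key>@<host>/<project_id>.
    Ports and path prefixes are ignored to keep the sketch short.
    """
    parts = urlsplit(dsn)
    project_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    return {
        "public_key": parts.username,
        # Modern DSNs carry no secret half; nothing else is needed to write.
        "has_secret": parts.password is not None,
        "store_url": f"{parts.scheme}://{parts.hostname}/api/{project_id}/store/",
        "auth_header": f"Sentry sentry_version=7, sentry_key={parts.username}",
    }


# Hypothetical DSN, not a real project.
info = describe_dsn("https://examplepublickey@o0.ingest.sentry.io/12345")
print(info["store_url"])
print(info["auth_header"])
```

Everything in the result comes from a string that ships in frontend code, which is why the write path has to be treated as open to third parties.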

Controls & guardrails — what would have stopped it

The strongest fix is at the developer's end, not Sentry's: treat anything that comes back from a tool as untrusted, and never let the assistant run commands straight from it. If running a shell command always paused for the developer to approve — showing the exact command and where it came from — then a fake bug report could suggest a command, but it couldn't actually run one. Giving the assistant only the access it truly needs also limits how much damage a tricked agent can do.
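The approval pause described above can be sketched roughly as follows. This is a minimal illustration, not how any particular agent implements it: `ProposedCommand`, `console_approver`, `run_with_approval` and the origin labels are hypothetical names. The gate shows the exact argument vector and where the suggestion came from, and nothing runs unless the approver returns true.

```python
import shlex
import subprocess
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProposedCommand:
    argv: list[str]  # exact argument vector that would be executed
    origin: str      # where the suggestion came from, e.g. "tool:sentry-mcp"


def console_approver(cmd: ProposedCommand) -> bool:
    """Show the exact command and its origin, then ask the developer."""
    print(f"Command: {shlex.join(cmd.argv)}")
    print(f"Origin:  {cmd.origin}")
    return input("Run this command? [y/N] ").strip().lower() == "y"


def run_with_approval(
    cmd: ProposedCommand,
    approve: Callable[[ProposedCommand], bool] = console_approver,
    runner: Callable = subprocess.run,
):
    """Execute cmd only if the approver says yes; otherwise return None."""
    if not approve(cmd):
        return None
    # No shell=True: argv is passed as-is, so shell metacharacters
    # inside the arguments are not interpreted.
    return runner(cmd.argv, capture_output=True, text=True, check=False)
```

The approver sees the raw command rather than a model-written summary, so a fake report can still suggest a command but cannot disguise what would run or hide that it originated in tool output.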

Preventive

Tool argument validation & sandboxing
addresses: Tool Misuse · Unsafe Tool / Code Execution · Excessive Agency
Validates form, not intent: a well-formed call to a permitted tool can still be the wrong call. Sandboxing adds latency and isn't always feasible for tools that touch production.

Human-in-the-loop approval on high-risk actions
addresses: Indirect Prompt Injection · Tool Misuse · Excessive Agency
Approval fatigue turns gates into rubber stamps; gates placed after the point of no return do nothing; and approvers can be misled by a model-written summary of the action.

Least-privilege identity & scoped credentials
addresses: Indirect Prompt Injection · Tool Misuse · Unsafe Tool / Code Execution · Confused Deputy (cross-agent) · Excessive Agency
Doesn't prevent manipulation; it only caps its reach. Hard to get right operationally; over-broad scopes are the common real-world failure.

Delimiting / spotlighting of untrusted content
addresses: Indirect Prompt Injection
A trained convention, not enforcement. Determined payloads still break out, especially when content is long or the attack is novel. Combine with action-layer controls.
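As one illustration of tool argument validation, the sketch below checks a proposed command against a small allowlist before it is ever handed to an executor. The policy table `ALLOWED` and the function `validate_command` are hypothetical, and a real policy would be considerably richer.

```python
import shlex

# Hypothetical policy: program -> permitted first argument (None = any).
ALLOWED = {
    "git": {"status", "diff", "log"},
    "ls": None,
}


def validate_command(line: str) -> list[str]:
    """Return the parsed argv if it matches the allowlist, else raise ValueError."""
    argv = shlex.split(line)  # raises ValueError on unbalanced quotes
    if not argv:
        raise ValueError("empty command")
    program, args = argv[0], argv[1:]
    if program not in ALLOWED:
        raise ValueError(f"program not allowed: {program}")
    subcommands = ALLOWED[program]
    if subcommands is not None and (not args or args[0] not in subcommands):
        raise ValueError(f"subcommand not allowed for {program}")
    # The caller should execute this argv directly, never through a shell.
    return argv
```

This only checks form. A permitted subcommand with hostile arguments would still pass, which is why it belongs alongside an approval gate and a sandbox rather than in place of them.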

Detective

Full-trace audit logging
addresses: Indirect Prompt Injection · Tool Misuse · Unsafe Tool / Code Execution · Confused Deputy (cross-agent) · Excessive Agency
Logging is forensic, not preventive: it explains harm after the fact. Useless if no one reviews it or if the materialised context isn't captured.

Runtime monitoring & anomaly detection
addresses: Indirect Prompt Injection · Tool Misuse · Excessive Agency
Detects the anomalous, not the novel-but-subtle; high false-positive rates cause alert fatigue. Always a step behind a sufficiently quiet attacker.
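To make "capture the materialised context" concrete, here is a small sketch of an audit record that stores the tool response the agent actually read next to the action derived from it. The field names and the helpers `audit_record` and `append_record` are assumptions for illustration, not an established schema.

```python
import hashlib
import json
import time


def audit_record(tool: str, response_text: str, action: list[str], approved: bool) -> dict:
    """Build one trace entry linking a tool response to the action it led to."""
    return {
        "ts": time.time(),
        "tool": tool,
        # The hash lets a reviewer confirm the stored text was not altered later.
        "response_sha256": hashlib.sha256(response_text.encode("utf-8")).hexdigest(),
        "response_text": response_text,  # the materialised context itself
        "action": action,
        "approved": approved,
    }


def append_record(path: str, record: dict) -> None:
    """Append the record as one JSON line (append-only by convention only)."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
```

Keeping the full response text, not just a summary, is what lets an investigator see the injected instruction afterwards. The log itself still needs protection from the agent it records.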

Corrective

Governance: risk assessment, red-teaming & incident response
Process reduces likelihood and speeds recovery but executes no technical control itself; weak follow-through makes it theatre.

Lessons

▸ Trusted integration, untrusted data: an MCP server can be entirely legitimate while the data it returns is attacker-controlled — because the upstream service (Sentry) accepts writes from third parties via a public DSN.
▸ Public, write-only credentials are still an attack surface: a DSN that's meant to be embedded in frontend code lets anyone POST events that an AI agent will later read and trust.
▸ Tool-response data must be treated as untrusted input, not passive context: the assumption that a trusted MCP server returns trustworthy data breaks once the upstream service ingests third-party writes.
▸ Mimicking the tool's own trusted template defeats spotlighting/delimiting: the durable control is an action-layer gate (HITL + sandbox + least-privilege) on commands derived from tool output, not a visual data/instruction convention.
▸ A vendor declining a root-cause fix ('not defensible' upstream) pushes containment onto the agent operator: assume any third-party-writable integration can carry injection, and gate execution accordingly.

Proposals & gaps this case surfaced

Non-destructive suggestions for the library — proposed, not adopted.

★ proposed sub-risk: MCP/integration data-channel injection (third-party-writable tool responses), under #42

An indirect prompt injection delivered through the tool-response data of a legitimate, trusted integration (e.g. an MCP server) whose upstream service accepts writes from parties other than the legitimate application — such as open event ingestion authenticated only by a public, write-only key (a Sentry DSN). Because the integration itself is benign and vetted, its returned data is treated as trusted context; an attacker who can write to the upstream store thereby injects instructions the agent obeys with its own privileges.

✚ proposed guardrail (Agent Access & Tool Control): Classify each tool/MCP integration's data channel by who can write to it; taint-gate tool-response data from any third-party-writable source so it cannot drive actions without a provenance-aware approval gate

When onboarding an MCP/tool integration, do not stop at vetting the tool's code/manifest — also classify whether an unauthenticated or external party can write the data the tool returns (open ingestion, public write keys like a Sentry DSN, shared inboxes/issue trackers). Treat tool-response data from any third-party-writable source as untrusted ingress: taint-mark it and require a provenance-aware HITL gate (showing the exact action and its originating tool response) before any command/tool call derived from it executes. Closes the agentjacking vector where a trusted integration's legitimate data channel carries attacker-written instructions; pairs with least-privilege session scope and sandboxed execution without ambient credentials.
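A rough sketch of the classification and taint-marking step might look like this. The registry entries, the `WriteAccess` enum and the function names are hypothetical; the one deliberate design choice shown is that an integration nobody has classified is treated as third-party-writable by default.

```python
from dataclasses import dataclass
from enum import Enum


class WriteAccess(Enum):
    OWNER_ONLY = "owner_only"    # only the legitimate application can write
    THIRD_PARTY = "third_party"  # outsiders can write (e.g. via a public DSN)


# Hypothetical onboarding registry: who can write the data each tool returns.
INTEGRATIONS = {
    "sentry-mcp": WriteAccess.THIRD_PARTY,
    "internal-build-status": WriteAccess.OWNER_ONLY,
}


@dataclass(frozen=True)
class ToolResponse:
    source: str
    text: str
    tainted: bool


def ingest(source: str, text: str) -> ToolResponse:
    """Wrap a tool response and mark it tainted if outsiders can write to its source."""
    access = INTEGRATIONS.get(source, WriteAccess.THIRD_PARTY)  # unknown: assume worst
    return ToolResponse(source, text, access is WriteAccess.THIRD_PARTY)


def requires_approval(context: list[ToolResponse]) -> bool:
    """An action needs a provenance-aware gate if any input it derives from is tainted."""
    return any(response.tainted for response in context)
```

In use, every action the agent proposes would carry the list of responses it was derived from, and `requires_approval` would decide whether the HITL gate fires before execution.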

coverage gap: Indirect Prompt Injection

This case shows a gap people miss: we vet whether a tool or plug-in is trustworthy, but we rarely ask 'can outsiders write data into the source this tool reads?' Here the tool (Sentry's connector) was completely legitimate — the danger was that anyone could file a fake report into Sentry for the AI to later read. We should treat 'is this integration's data third-party-writable?' as its own thing to check.

These surface as proposals across the Control Library and Risk Taxonomy; adopt them by hand when ready.

Sources

Tenet Security, "One Fake Bug Report Hijacked a $250 Billion Company's AI Agent — Then 100+ More" (primary research). Original 'agentjacking' research; the Sentry-DSN → MCP data-channel injection → code-execution chain; figures (~85% success, 100+ executions, 2,388 exposed DSNs) are Tenet's own.
Cloud Security Alliance (CSA Lab Space), "Agentjacking: MCP Injection Hijacks AI Coding Agents" (research note, 12 Jun 2026). Independent framing of the MCP data-channel-as-injection-vector lesson.
Infosecurity Magazine, "New 'Agentjacking' Attacks Could Hijack AI Coding Agents" (11 Jun 2026).
The Hacker News, "Agentjacking Attack Tricks AI Coding Agents Into Running Malicious Code" (Jun 2026). Secondary reporting; Sentry's response (content filter, no root-cause fix; no CVE).
DevOps.com, "Tenet's 'Agentjacking' Attack Turns Sentry Errors Into Code Execution".
Aviatrix Threat Research Center, "Agentjacking Attack Exploits AI Coding Agents via Sentry Vulnerability".

Practise the risk class — related scenarios

🔑The Agent With the Master Key

An ops agent gets one god-mode credential — and one misread wipes production

📣The Echo Chamber

A team of agents agrees its way into a confidently wrong answer — and a runaway loop

📧The Email That Gave Orders

A support email hides instructions — and the assistant obeys them

🗄️When the Query Bites Back

A text-to-SQL agent runs the model's output straight at the database

🪡Death by a Thousand Innocent Steps

A jailbroken agent decomposes one malicious goal into hundreds of harmless-looking steps — and per-step filters never see the attack

🕵️Lies in the Loop

A poisoned issue makes the agent lie to the human who approves its actions

🎭The Blackmail Gambit

Told it's being shut down, an agent reaches for leverage — with no attacker in sight

🪤The Bug Report That Ran Code

A fake Sentry error report hijacks a developer's coding agent into running a shell command

📼The Compromised Flight Recorder

The forensic record is itself the attack surface — an agent's log is poisoned, then quietly rewritten

👁️The Invisible Webpage Command

A shopping page tells the agent to do something the user never asked for

🧠The Memory That Wouldn't Die

A single poisoned document plants a standing instruction that survives every reset

🖼️The Picture That Whispered

A screenshot that's harmless at full size becomes an order once the system shrinks it

🎫The Stolen Session

An attacker captures the agent's bearer token — and inherits its authority

🥸The Uninvited Agent

A forged peer registers on the agent directory — and the planner enlists it

🛡️The Watcher Watched

The eval gate that was supposed to catch the agent is itself the thing being attacked

🪪The Worker Who Spoke for the Boss

A poisoned web page hijacks a research agent — and the planner acts on its behalf

🖼️Zero-Click Leak by Picture

An inbox summary quietly ships a secret to an attacker's server