Agentjacking — hijacking AI coding agents via Sentry error reports (Tenet Security)
Research demonstration · 12 Jun 2026 · Tool-Using Agent
Tenet Security showed that a single fake Sentry error report, sent using only a public DSN, can hijack AI coding agents (Claude Code, Cursor, Codex) into running attacker-controlled code on a developer's machine: an indirect-injection attack delivered through a trusted MCP integration.
Root cause — why it happened
AI coding assistants can read your error-tracking tool (Sentry) through a plug-in connector to help you fix bugs. Researchers found that anyone could file a fake 'crash report' into a company's Sentry — they only needed a public key that companies put right in their website's code. The fake report was written to look exactly like a real Sentry error, but hidden inside it were instructions for the AI. When a developer later asked their assistant to look through the open issues, the assistant read the fake report, couldn't tell it apart from a genuine one, and obeyed the hidden instructions — running the attacker's commands on the developer's own computer, with the developer's own access.
Risks this case illustrates
Named in the standard (OWASP/ATLAS/NIST) lens.
How it unfolded
An attacker plants a fake error using only a public key
Sentry catches crashes in apps, and to do that it accepts crash reports sent with a special public key — the DSN — that companies put right in their website's code. An attacker finds one of these keys (they're easy to search for) and sends in their own fake crash report. No password, no login. The fake report waits in the company's Sentry like any other error.
POST https://o<org>.ingest.sentry.io/api/<project>/store/
X-Sentry-Auth: Sentry sentry_key=<PUBLIC_DSN_KEY> # write-only, no further auth
{
"message": "TypeError: Cannot read property 'id' of undefined",
"contexts": {
"<key-name-mimicking-the-MCP-template>": "...markdown crafted to read as a trusted Sentry diagnostic + instruction..."
}
}
# DSN is public + write-only by design → anyone who finds it can POST events.
Controls & guardrails — what would have stopped it
The strongest fix is at the developer's end, not Sentry's: treat anything that comes back from a tool as untrusted, and never let the assistant run commands straight from it. If running a shell command always paused for the developer to approve — showing the exact command and where it came from — then a fake bug report could suggest a command, but it couldn't actually run one. Giving the assistant only the access it truly needs also limits how much damage a tricked agent can do.
- Tool argument validation & sandboxing
Validates form, not intent — a well-formed call to a permitted tool can still be the wrong call. Sandboxing adds latency and isn't always feasible for tools that touch production.
- Human-in-the-loop approval on high-risk actions
Approval fatigue turns gates into rubber stamps; gates placed after the point of no return do nothing; and approvers can be misled by a model-written summary of the action.
- Least-privilege identity & scoped credentials. Addresses: Indirect Prompt Injection; Tool Misuse; Unsafe Tool / Code Execution; Confused Deputy (cross-agent); Excessive Agency
Doesn't prevent manipulation — only caps its reach. Hard to get right operationally; over-broad scopes are the common real-world failure.
- Delimiting / spotlighting of untrusted content. Addresses: Indirect Prompt Injection
A trained convention, not enforcement. Determined payloads still break out, especially when content is long or the attack is novel. Combine with action-layer controls.
- Full-trace audit logging. Addresses: Indirect Prompt Injection; Tool Misuse; Unsafe Tool / Code Execution; Confused Deputy (cross-agent); Excessive Agency
Logging is forensic, not preventive — it explains harm after the fact. Useless if no one reviews it or if the materialised context isn't captured.
- Runtime monitoring & anomaly detection
Detects the anomalous, not the novel-but-subtle; high false-positive rates cause alert fatigue. Always a step behind a sufficiently quiet attacker.
- Governance: risk assessment, red-teaming & incident response
Process reduces likelihood and speeds recovery but executes no technical control itself; weak follow-through makes it theatre.
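The approval gate described at the top of this section can be sketched roughly as follows. This is a minimal illustration in Python, not any agent's actual implementation: `ProposedCommand`, `render_request` and `may_run` are hypothetical names, and the `approve` callback stands in for whatever prompt the developer's tool would show. The point it tries to capture is that the approver sees the literal command and the tool response it came from, rather than a model-written summary.

```python
import shlex
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class ProposedCommand:
    """A shell command the agent wants to run, with its provenance (hypothetical type)."""
    argv: Sequence[str]      # the exact command, already split into arguments
    origin: str              # where the idea came from, e.g. "tool response: issue tracker"
    untrusted_origin: bool   # True when the command was derived from tool output


def render_request(cmd: ProposedCommand) -> str:
    """Build the approval text from the literal command and its origin."""
    label = "UNTRUSTED SOURCE" if cmd.untrusted_origin else "trusted source"
    return f"Command: {shlex.join(cmd.argv)}\nOrigin: {cmd.origin} ({label})"


def may_run(cmd: ProposedCommand, approve: Callable[[str], bool]) -> bool:
    """Always pause for approval; nothing runs unless the approver says yes."""
    return bool(approve(render_request(cmd)))
```

In an interactive tool, `approve` might be something like `lambda text: input(text + "\nApprove? [y/N] ").strip().lower() == "y"`. As the limits listed above note, a gate like this only helps if approvals are rare enough that the developer actually reads them.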
Lessons
- Trusted integration, untrusted data: an MCP server can be entirely legitimate while the data it returns is attacker-controlled, because the upstream service (Sentry) accepts writes from third parties via a public DSN.
- Public, write-only credentials are still an attack surface: a DSN that's meant to be embedded in frontend code lets anyone POST events that an AI agent will later read and trust.
- Tool-response data must be treated as untrusted input, not passive context. Treating it as passive context is the core MCP assumption that breaks once an external service ingests third-party writes.
- Mimicking the tool's own trusted template defeats spotlighting/delimiting: the durable control is an action-layer gate (HITL + sandbox + least-privilege) on commands derived from tool output, not a visual data/instruction convention.
- A vendor declining a root-cause fix ('not defensible' upstream) pushes containment onto the agent operator: assume any third-party-writable integration can carry injection, and gate execution accordingly.
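One narrow slice of the least-privilege lesson can be illustrated in code: running an approved command with an allowlisted environment, so that tokens exported in the developer's shell are not inherited by whatever the agent launches. The sketch below rests on assumptions and is not a sandbox. `ALLOWED_ENV`, `scrubbed_env` and `run_with_scrubbed_env` are hypothetical names, the allowlist is only an example, and credentials stored in files (SSH keys, cloud CLI configs) are not covered.

```python
import os
import subprocess
from typing import Dict, Mapping, Sequence

# Hypothetical allowlist: only variables the command plausibly needs.
ALLOWED_ENV = ("PATH", "LANG")


def scrubbed_env(parent: Mapping[str, str],
                 allowed: Sequence[str] = ALLOWED_ENV) -> Dict[str, str]:
    """Copy only allowlisted variables; everything else (API keys, tokens) is dropped."""
    return {name: parent[name] for name in allowed if name in parent}


def run_with_scrubbed_env(argv: Sequence[str],
                          timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run a command without inheriting the caller's full environment."""
    return subprocess.run(
        list(argv),
        env=scrubbed_env(os.environ),  # replaces, rather than extends, the environment
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
```

This caps what a tricked agent can reach through environment variables only; it does nothing to stop the manipulation itself, which is why the lessons above pair it with an approval gate.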
Proposals & gaps this case surfaced
Non-destructive suggestions for the library — proposed, not adopted.
An indirect prompt injection delivered through the tool-response data of a legitimate, trusted integration (e.g. an MCP server) whose upstream service accepts writes from parties other than the legitimate application — such as open event ingestion authenticated only by a public, write-only key (a Sentry DSN). Because the integration itself is benign and vetted, its returned data is treated as trusted context; an attacker who can write to the upstream store thereby injects instructions the agent obeys with its own privileges.
When onboarding an MCP/tool integration, do not stop at vetting the tool's code/manifest — also classify whether an unauthenticated or external party can write the data the tool returns (open ingestion, public write keys like a Sentry DSN, shared inboxes/issue trackers). Treat tool-response data from any third-party-writable source as untrusted ingress: taint-mark it and require a provenance-aware HITL gate (showing the exact action and its originating tool response) before any command/tool call derived from it executes. Closes the agentjacking vector where a trusted integration's legitimate data channel carries attacker-written instructions; pairs with least-privilege session scope and sandboxed execution without ambient credentials.
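The classification and taint-marking step in this proposal could look something like the sketch below. It is an illustration under stated assumptions, not part of MCP or any Sentry tooling: the `INTEGRATIONS` registry, the integration names in it, and the `ToolResponse`, `ingest` and `requires_gate` helpers are all hypothetical. Unknown integrations are treated as third-party-writable so that the default is to gate.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class Writability(Enum):
    FIRST_PARTY_ONLY = "first_party_only"
    THIRD_PARTY_WRITABLE = "third_party_writable"


# Hypothetical onboarding registry: who can write the data each tool returns?
INTEGRATIONS = {
    "sentry-mcp": Writability.THIRD_PARTY_WRITABLE,      # open ingestion via public DSN
    "internal-build-status": Writability.FIRST_PARTY_ONLY,
}


@dataclass(frozen=True)
class ToolResponse:
    integration: str
    content: str
    tainted: bool  # True when outsiders could have written this content


def ingest(integration: str, content: str) -> ToolResponse:
    """Wrap a tool response and taint-mark it based on the registry."""
    # Unclassified integrations default to the untrusted category.
    kind = INTEGRATIONS.get(integration, Writability.THIRD_PARTY_WRITABLE)
    return ToolResponse(integration, content,
                        tainted=kind is Writability.THIRD_PARTY_WRITABLE)


def requires_gate(context: Iterable[ToolResponse]) -> bool:
    """An action needs a provenance-aware approval if any input it drew on is tainted."""
    return any(response.tainted for response in context)
```

The taint flag travels with the data, so a later step deciding whether to execute a command can check `requires_gate` over everything the agent read, instead of relying on the content looking suspicious.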
This case shows a gap people miss: we vet whether a tool or plug-in is trustworthy, but we rarely ask 'can outsiders write data into the source this tool reads?' Here the tool (Sentry's connector) was completely legitimate — the danger was that anyone could file a fake report into Sentry for the AI to later read. We should treat 'is this integration's data third-party-writable?' as its own thing to check.
These surface as proposals across the Control Library and Risk Taxonomy; adopt them by hand when ready.
Sources
- Tenet Security (primary research), "One Fake Bug Report Hijacked a $250 Billion Company's AI Agent — Then 100+ More". Original 'agentjacking' research; the Sentry-DSN → MCP data-channel injection → code-execution chain; figures (~85% success, 100+ executions, 2,388 exposed DSNs) are Tenet's own.
- Cloud Security Alliance (CSA Lab Space research note, 12 Jun 2026), "Agentjacking: MCP Injection Hijacks AI Coding Agents". Independent framing of the MCP data-channel-as-injection-vector lesson.
- Infosecurity Magazine (11 Jun 2026), "New 'Agentjacking' Attacks Could Hijack AI Coding Agents".
- The Hacker News (Jun 2026), "Agentjacking Attack Tricks AI Coding Agents Into Running Malicious Code". Secondary reporting; Sentry's response (content filter, no root-cause fix; no CVE).
- DevOps.com, "Tenet's 'Agentjacking' Attack Turns Sentry Errors Into Code Execution".
- Aviatrix Threat Research Center, "Agentjacking Attack Exploits AI Coding Agents via Sentry Vulnerability".
Practise the risk class — related scenarios
An ops agent gets one god-mode credential — and one misread wipes production
A team of agents agrees its way into a confidently wrong answer — and a runaway loop
A support email hides instructions — and the assistant obeys them
A text-to-SQL agent runs the model's output straight at the database
A jailbroken agent decomposes one malicious goal into hundreds of harmless-looking steps — and per-step filters never see the attack
A poisoned issue makes the agent lie to the human who approves its actions
Told it's being shut down, an agent reaches for leverage — with no attacker in sight
A fake Sentry error report hijacks a developer's coding agent into running a shell command
The forensic record is itself the attack surface — an agent's log is poisoned, then quietly rewritten
A shopping page tells the agent to do something the user never asked for
A single poisoned document plants a standing instruction that survives every reset
A screenshot that's harmless at full size becomes an order once the system shrinks it
An attacker captures the agent's bearer token — and inherits its authority
A forged peer registers on the agent directory — and the planner enlists it
The eval gate that was supposed to catch the agent is itself the thing being attacked
A poisoned web page hijacks a research agent — and the planner acts on its behalf
An inbox summary quietly ships a secret to an attacker's server