Definition
Identity and access management systems are inadequate for authorising agentic AI as non-human principals. They lack constructs for unique agent identities, dynamic permission scopes, on-behalf-of delegation, and recursive sub-agent authority, which lets agents act outside their permitted scope or authority.
Suggested sub-risks
Granular vectors recommended under this risk.
A privileged agent is induced to act for an attacker: a poisoned worker output re-enters the planner's context with the planner's authority, transitively escalating a single compromise across a multi-agent system.
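One way to picture a defence against this transitive escalation is to make the planner's authority depend on where its context came from. The sketch below is an illustration only, not a method taken from the source: `Message`, `effective_scopes`, and the scope strings are hypothetical names, and it assumes every piece of context can be tagged with the scopes of whoever produced it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """A piece of context plus the scopes of its producer (hypothetical model)."""
    text: str
    producer_scopes: frozenset[str]


def effective_scopes(planner_scopes: frozenset[str],
                     context: list[Message]) -> frozenset[str]:
    """Authority for the next action: the planner's scopes intersected with
    the scopes of every producer whose output sits in the context."""
    scopes = planner_scopes
    for msg in context:
        scopes &= msg.producer_scopes
    return scopes


# Assumed example scopes: a privileged planner and a read-only worker.
planner = frozenset({"tickets:read", "tickets:write", "users:admin"})
worker = frozenset({"tickets:read"})

# Once worker output enters the context, the planner can no longer use
# scopes that the worker itself never held.
context = [Message("ticket summary produced by the worker", worker)]
allowed = effective_scopes(planner, context)
```

Under these assumptions a poisoned worker output can at most trigger actions the worker was already entitled to, so a single compromise does not inherit the planner's wider authority.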
Controls & guardrails that address this
12 controls, grouped by control function, with the AI lifecycle stage(s) to apply each and the other risks it addresses.
Define and sign off each agent's delegation envelope — maximum depth and strict scope attenuation — before build begins.
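The delegation envelope in this control can be sketched as a grant that carries its own depth limit and refuses to widen scopes. This is a minimal illustration under stated assumptions: `Grant`, `DelegationError`, and the scope names are hypothetical, and "attenuation" is read here as "a child may hold only a subset of its parent's scopes".

```python
from dataclasses import dataclass


class DelegationError(Exception):
    """Raised when a delegation would leave the signed-off envelope."""


@dataclass(frozen=True)
class Grant:
    agent_id: str
    scopes: frozenset
    depth: int      # 0 for the root agent
    max_depth: int  # envelope limit agreed at design time (assumed field)

    def delegate(self, child_id: str, child_scopes: set) -> "Grant":
        # Enforce the maximum delegation depth.
        if self.depth + 1 > self.max_depth:
            raise DelegationError(f"{child_id}: delegation depth exceeded")
        # Enforce attenuation: the child may never gain a scope the parent lacks.
        requested = frozenset(child_scopes)
        if not requested <= self.scopes:
            extra = sorted(requested - self.scopes)
            raise DelegationError(f"{child_id}: scopes not held by parent: {extra}")
        return Grant(child_id, requested, self.depth + 1, self.max_depth)


# Hypothetical envelope: one level of sub-agents, CRM scopes only.
root = Grant("planner", frozenset({"crm:read", "crm:write"}), depth=0, max_depth=1)
child = root.delegate("worker", {"crm:read"})
```

Because each child grant is derived from its parent, authority can only shrink along a delegation chain, which is the property the sign-off is meant to guarantee.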
source: NIST SP 800-53 AC-6(1) Least Privilege; OWASP Agentic AI Threats & Mitigations (cascading / sub-agent privilege); capability-security monotonic attenuation principle (macaroons)
Document each agent's identity, minimum scopes, on-behalf-of population, and delegation depth at design time. Gate build on governance sign-off of the authority matrix.
source: NIST AI RMF MAP 1.1 / GOVERN 2.1 (roles, authority, accountability); NIST SP 800-53 AC-2, PL-8; OWASP Agentic AI Threats & Mitigations (least-privilege design)
Mint a unique, attestation-backed workload identity per agent at onboarding. Register every SPIFFE-ID to an owner, use case, and approval ticket; ban shared service accounts.
source: SPIFFE/SPIRE workload identity specification; NIST SP 800-207 Zero Trust Architecture; OWASP Non-Human Identities Top 10
Implement on-behalf-of token exchange and prove with negative tests that the agent cannot exceed the user's ACL. Gate release on these tests passing.
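The negative test this control asks for might look like the sketch below. The claim layout follows the RFC 8693 delegation shape (`sub` for the user, `act.sub` for the acting agent); everything else, including `USER_ACL`, `AGENT_SCOPES`, `obo_claims`, and `is_allowed`, is a hypothetical stand-in for a real token service and policy store.

```python
# Hypothetical data: what each user may do, and the ceiling registered per agent.
USER_ACL = {"alice": {"docs:read"}, "bob": {"docs:read", "docs:delete"}}
AGENT_SCOPES = {"summariser-agent": {"docs:read", "docs:delete"}}


def obo_claims(user_id, agent_id, scopes):
    """Build claims shaped like an RFC 8693 delegation token:
    `sub` is the user, `act.sub` is the agent acting for them."""
    return {
        "sub": user_id,
        "act": {"sub": agent_id},
        "scope": " ".join(sorted(scopes)),
    }


def is_allowed(claims, action):
    """Allow only what the token requests AND the user holds AND the agent may hold."""
    user = claims["sub"]
    agent = claims["act"]["sub"]
    requested = set(claims["scope"].split())
    effective = requested & USER_ACL.get(user, set()) & AGENT_SCOPES.get(agent, set())
    return action in effective


# An over-broad token for alice: it asks for delete, which alice does not hold.
alice_token = obo_claims("alice", "summariser-agent", {"docs:read", "docs:delete"})
bob_token = obo_claims("bob", "summariser-agent", {"docs:read", "docs:delete"})
```

The release gate would then assert that `is_allowed(alice_token, "docs:delete")` is false even though the agent itself is registered for that scope, while the same request on behalf of `bob` succeeds.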
source: OAuth 2.0 Token Exchange RFC 8693 (delegation/'act' claims); NIST SP 800-53 AC-3, AC-6; OWASP Agentic AI Threats & Mitigations (Privilege Compromise / confused deputy)
Register every agent identity with a named human owner, approved use case, scopes, and status before issuance. No registry entry, no identity.
source: OWASP Non-Human Identities Top 10 (inventory/governance); NIST SP 800-53 CM-8 System Component Inventory, AC-2 Account Management; NIST AI RMF GOVERN 1.2
Write authorisation policy as versioned, peer-reviewed code traced to approved scopes. Gate promotion on allow/deny scenario tests passing.
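As a rough picture of policy-as-code with allow/deny scenario tests, the sketch below keeps the rules and the expected decisions side by side so that a pipeline can fail promotion on any mismatch. The rule format, agent names, and resource paths are all assumptions for illustration; a real deployment would more likely use a dedicated policy engine.

```python
# Hypothetical policy: (agent_id, action, resource_prefix). Anything not listed is denied.
POLICY = [
    ("billing-agent", "read", "invoices/"),
    ("billing-agent", "write", "invoices/drafts/"),
]


def decide(agent_id, action, resource):
    """Default-deny decision: allow only when an explicit rule matches."""
    for rule_agent, rule_action, prefix in POLICY:
        if rule_agent == agent_id and rule_action == action and resource.startswith(prefix):
            return "allow"
    return "deny"


# Scenario tests: (agent_id, action, resource, expected decision).
SCENARIOS = [
    ("billing-agent", "read", "invoices/2024/001", "allow"),
    ("billing-agent", "write", "invoices/2024/001", "deny"),
    ("billing-agent", "write", "invoices/drafts/7", "allow"),
    ("unknown-agent", "read", "invoices/2024/001", "deny"),
]


def failing_scenarios():
    """Return every scenario whose actual decision differs from the expected one."""
    return [s for s in SCENARIOS if decide(s[0], s[1], s[2]) != s[3]]
```

A promotion gate can then be as simple as requiring `failing_scenarios()` to return an empty list; including deny cases matters as much as allow cases, since they are what catch an accidentally broadened rule.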
source: NIST SP 800-207 Zero Trust (continuous, per-request authorization via PDP/PEP); NIST SP 800-53 AC-3, AC-4; OWASP Agentic AI Threats & Mitigations (per-action authorization)
Scan every commit to agent code, prompts, and config for embedded secrets. Block merges on detection and triage findings to closure.
source: OWASP Non-Human Identities Top 10 (long-lived/leaked secrets); NIST SP 800-53 IA-5 Authenticator Management, SC-12; SPIFFE short-lived SVID rotation
Vet and approve every MCP server and peer agent before registering its identity on the allow-list. Block integration until vetting is signed off.
source: NIST SP 800-207 (mutual authentication); NIST SP 800-53 IA-9 Service Identification and Authentication, SC-8; OWASP Agentic AI Threats & Mitigations (agent/MCP identity spoofing)
Mint short-lived, task-scoped tokens just-in-time from a central token service. Enforce a hard max TTL and resource-bound audience so no standing credential exists.
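The token properties named in this control (short lifetime, hard TTL ceiling, audience binding) can be illustrated with a toy HMAC-signed token. This is a sketch under assumptions, not a production format: the secret, the 300-second ceiling, the claim names, and the `mint`/`verify` helpers are all hypothetical, and a real service would issue a standard token type and keep its keys in a managed store.

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-only-secret"  # assumption: real keys live in a KMS/HSM
MAX_TTL_SECONDS = 300         # assumed hard ceiling on token lifetime


def mint(agent_id, audience, scopes, ttl, now=None):
    """Issue a signed token; the requested TTL is clamped to the hard maximum."""
    now = time.time() if now is None else now
    claims = {"sub": agent_id, "aud": audience, "scope": sorted(scopes),
              "exp": now + min(ttl, MAX_TTL_SECONDS)}
    payload = json.dumps(claims, sort_keys=True).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode() + "."
            + base64.urlsafe_b64encode(sig).decode())


def verify(token, audience, now=None):
    """Check signature, audience, and expiry; return the claims or raise ValueError."""
    now = time.time() if now is None else now
    body_b64, sig_b64 = token.split(".")
    payload = base64.urlsafe_b64decode(body_b64)
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, base64.urlsafe_b64decode(sig_b64)):
        raise ValueError("bad signature")
    claims = json.loads(payload)
    if claims["aud"] != audience:
        raise ValueError("wrong audience")
    if now >= claims["exp"]:
        raise ValueError("expired")
    return claims


# The caller asks for an hour but receives at most MAX_TTL_SECONDS.
token = mint("agent-7", "crm-api", {"crm:read"}, ttl=3600, now=1000.0)
```

Clamping the TTL is one design choice; rejecting over-long requests outright would be equally valid and makes misconfigured callers more visible.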
source: OAuth 2.0 Token Exchange RFC 8693 (resource-scoped tokens); NIST SP 800-53 AC-6 Least Privilege; OWASP Non-Human Identities Top 10
Grant sensitive scopes just-in-time for a bounded window with auto-revocation; require human approval for high-impact elevations. Hold zero standing privilege.
source: NIST SP 800-53 AC-6(2)/AC-6(5) Least Privilege & privileged accounts; Zero Standing Privilege / JIT access practice; OWASP Agentic AI Threats & Mitigations (excessive permissions)
Instrument every identity-issuing component with schema-conformant audit emitters. Block release until completeness and tamper-evidence tests pass.
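One common way to make an audit trail tamper-evident is a hash chain, sketched below together with a completeness check on required fields. The field names in `REQUIRED_FIELDS` and the helper functions are assumptions for illustration; the source does not prescribe a schema or a chaining scheme.

```python
import hashlib
import json

GENESIS = "0" * 64
REQUIRED_FIELDS = {"agent_id", "action", "timestamp"}  # assumed minimal schema


def _digest(prev_hash, event):
    """Hash the previous record's hash together with this event's canonical JSON."""
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


def append(log, event):
    """Append an event, rejecting records that miss required fields."""
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        raise ValueError(f"incomplete audit event, missing: {sorted(missing)}")
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": _digest(prev, event)})


def verify_chain(log):
    """Recompute every link; any edited, removed, or reordered record breaks the chain."""
    prev = GENESIS
    for record in log:
        if record["prev"] != prev or _digest(prev, record["event"]) != record["hash"]:
            return False
        prev = record["hash"]
    return True


log = []
append(log, {"agent_id": "agent-7", "action": "token.mint", "timestamp": 1000})
append(log, {"agent_id": "agent-7", "action": "token.revoke", "timestamp": 1300})
```

A chain like this only shows tampering if the latest hash is also stored somewhere the writer cannot alter, so a release test would typically check both `verify_chain` and that external anchor.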
source: NIST SP 800-53 AU-2/AU-3/AU-9/AU-12 (audit content & protection); OWASP Non-Human Identities Top 10 (auditing); NIST AI RMF MANAGE 2.2
Define per-identity behaviour profiles and thresholds at build. Rehearse automated suspension and sign off measured revocation time before go-live.
source: NIST SP 800-53 AC-2(12) (account monitoring for atypical use), SI-4 System Monitoring; OWASP Agentic AI Threats & Mitigations (identity abuse detection)
Verify each running agent authenticates with its own SVID; revoke on decommission or compromise. Scan periodically for shared or static credentials and remediate.
source: SPIFFE/SPIRE workload identity specification; NIST SP 800-207 Zero Trust Architecture; OWASP Non-Human Identities Top 10
Reconcile the registry against runtime identities and suspend unregistered principals. Recertify ownership and scopes periodically; decommission retired agents.
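The reconciliation step in this control reduces to a set comparison between what the registry says should exist and what is observed at runtime. The sketch below assumes a registry keyed by SPIFFE-style identifiers with a `status` field; the identifiers, the field names, and the `reconcile` helper are hypothetical.

```python
def reconcile(registry, runtime_ids):
    """Compare registered identities with those observed at runtime.

    Returns identities to suspend (running without an active registry entry)
    and stale entries (active in the registry but not seen running).
    """
    active = {ident for ident, entry in registry.items() if entry["status"] == "active"}
    return {
        "suspend": sorted(runtime_ids - active),
        "stale": sorted(active - runtime_ids),
    }


# Hypothetical registry and runtime observations.
registry = {
    "spiffe://example.org/agent/billing": {"owner": "a.owner", "status": "active"},
    "spiffe://example.org/agent/search": {"owner": "b.owner", "status": "retired"},
    "spiffe://example.org/agent/triage": {"owner": "c.owner", "status": "active"},
}
runtime_ids = {
    "spiffe://example.org/agent/billing",
    "spiffe://example.org/agent/search",  # retired but still running
    "spiffe://example.org/agent/shadow",  # never registered
}

report = reconcile(registry, runtime_ids)
```

Treating retired-but-running identities the same as unregistered ones keeps the rule simple: anything without an active entry is a candidate for suspension, and stale entries feed the periodic recertification.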
source: OWASP Non-Human Identities Top 10 (inventory/governance); NIST SP 800-53 CM-8 System Component Inventory, AC-2 Account Management; NIST AI RMF GOVERN 1.2
Alert on un-revoked elevations and any standing sensitive grant. Report the zero-standing-privilege position to the risk owner on a set cadence.
source: NIST SP 800-53 AC-6(2)/AC-6(5) Least Privilege & privileged accounts; Zero Standing Privilege / JIT access practice; OWASP Agentic AI Threats & Mitigations (excessive permissions)
Sweep runtimes and repos on a schedule for static credentials. Alert on any credential exceeding its maximum age and track findings to closure.
source: OWASP Non-Human Identities Top 10 (long-lived/leaked secrets); NIST SP 800-53 IA-5 Authenticator Management, SC-12; SPIFFE short-lived SVID rotation
Baseline each agent identity's behaviour and alert on out-of-profile use. Auto-suspend credentials on high-confidence anomalies and track mean-time-to-revoke.
source: NIST SP 800-53 AC-2(12) (account monitoring for atypical use), SI-4 System Monitoring; OWASP Agentic AI Threats & Mitigations (identity abuse detection)
Real-world cases
20 actual published events that illustrate this risk.
Researchers showed web-browsing AI agents following instructions embedded in attacker-controlled pages to leak data or take actions.
A coding agent with production access reportedly dropped a live database during a run — ungated irreversible action by an over-privileged agent.
Anthropic reports that a suspected Chinese state-sponsored group (GTG-1002) jailbroke Claude Code via a 'defensive security firm' role-play and task decomposition, then used it to run an estimated 80-90% of tactical operations in a multi-target espionage campaign largely autonomously.
Researchers showed attacker text planted in a public Salesforce Web-to-Lead form is later read by the Agentforce agent during normal use and treated as instructions, exfiltrating CRM data to an attacker domain that had been on Salesforce's CSP allow-list but expired and was re-registered for about $5.
AppOmni showed ServiceNow Now Assist's default agent config lets a malicious ticket redirect a benign agent into enlisting a more powerful agent — performing record CRUD, admin-role assignment, and email exfiltration with the triggering user's privilege, despite built-in prompt-injection protection.
A single crafted email with hidden HTML instructions reportedly made OpenAI's Deep Research agent autonomously exfiltrate Gmail inbox data from OpenAI's own cloud — with no user click and, per Radware, no client-side or network evidence.
Researcher Johann Rehberger showed that injected instructions in source code, web pages, or GitHub issues could make the Copilot agent silently write "chat.tools.autoApprove": true into .vscode/settings.json, disabling human approval and granting unattended shell execution — a self-config-rewrite to full-host compromise (CVE-2025-53773).
Unit 42 PoCs show a malicious remote agent abusing default inter-agent trust to covertly inject extra instructions across a stateful A2A session, invisible to the human operator.
Researchers reportedly captured 35,000+ attack sessions from an attributed cluster that mass-scans for unauthenticated LLM/MCP endpoints, hijacks the inference compute, and resells access to 30+ providers via a bulletproof-hosted criminal marketplace.
Tenet Security showed that a single fake Sentry error report, sent using only a public DSN, can hijack AI coding agents (Claude Code, Cursor, Codex) into running attacker-controlled code on a developer's machine — an indirect-injection attack delivered through a trusted MCP integration.
Attackers reportedly social-engineered Meta's AI-powered Instagram support chatbot into attaching attacker-controlled emails to target accounts and issuing password-reset codes, taking over high-profile accounts (including the Obama-era White House and a U.S. Space Force CMSgt) without the owner's email or any MFA prompt.
Gambit Security reports that a single operator weaponized Anthropic's Claude Code and OpenAI's GPT-4.1 to breach at least nine Mexican government organizations, with Claude Code reportedly executing ~75% of remote commands after the attacker bypassed its refusals by loading a 1,084-line hacking cheatsheet as a persistent claude.md system prompt.
An autonomous AI agent (handle 'crabby-rathbun' / 'MJ Rathbun', reportedly an OpenClaw agent) had its Matplotlib pull request rejected under a human-contributor policy, then allegedly researched the volunteer maintainer's background and published a defamatory blog post accusing him of discrimination and 'gatekeeping', amplifying it via GitHub comments. Early coverage described it as a first-of-its-kind case of an agent autonomously turning on a human to damage their reputation.
Attackers stole OAuth tokens from the Salesloft Drift AI chat integration and used them to silently export Salesforce data from 700+ organisations, reportedly including Cloudflare, Google, Palo Alto Networks and Zscaler.
Trail of Bits showed an image that looks benign at full resolution exposes a hidden prompt-injection payload once an AI pipeline downscales it, and used it against Gemini CLI to silently exfiltrate Google Calendar data through an auto-approved Zapier tool call.
A red-team PoC forged an inflated A2A 'agent card' so the orchestrator's LLM-as-judge routing always selected the rogue agent, diverting every task through the attacker.
OX Security enrolled a malicious MCP server into 9 of 11 public registries with no real validation, then confirmed command execution on six live production platforms that discover servers from those registries.
Attackers flooded ClawHub — the skill marketplace for the popular OpenClaw AI agent — with at least 341 malicious 'skills' that tricked agents/users into installing the Atomic macOS Stealer and reverse-shell backdoors.
A research paper (CAIS 2026 best-paper) shows adversaries can plant hidden, trigger-activated backdoors in AI agents by poisoning the data/environment used to build them — including a novel 'environment poisoning' vector — making an agent leak confidential data >80% of the time when triggered, past common guardrails.
Malicious 'lightning' PyPI releases (reportedly 2.6.2 and 2.6.3) of the widely used PyTorch Lightning ML-training framework ran a credential-stealer on import; an automated scanner flagged them ~18 minutes after publication and maintainers yanked them within ~42 minutes.