Definition
Actions taken by an AI agent disrupt or damage connected systems it interacts with (production codebases, third-party APIs, downstream agents) through compromise, malfunction or excessive load. In multi-agent settings, faults can cascade across coordinated agents and amplify the impact.
Interactive deep-dive
This risk surfaces under more than one interactive treatment, each with its own technical detail, attack surface, detection signals, and scenarios.
Suggested sub-risks (not yet in your taxonomy)
Granular vectors recommended under this risk.
A self-hosted inference/serving or MCP endpoint (e.g. Ollama, an OpenAI-compatible API, or an access-control-less MCP server) is reachable from an untrusted network without authentication, allowing third parties to hijack the inference compute (resale/denial-of-wallet/mining), read prompt and conversation state, and, via co-located over-privileged tools, pivot into connected systems.
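One way to surface this vector during asset review is to flag any endpoint whose listen address is not loopback or private while authentication is off. The sketch below is a minimal illustration using Python's standard `ipaddress` module; the function names `classify_bind` and `needs_review` are hypothetical, and the check deliberately ignores port forwarding, reverse proxies, and DNS resolution, any of which can expose an endpoint that looks private here.

```python
import ipaddress


def classify_bind(host: str) -> str:
    """Classify a listen address as loopback, private, all-interfaces, public or unknown."""
    if host == "localhost":
        return "loopback"
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Hostnames would need DNS resolution, which this sketch skips.
        return "unknown"
    if addr.is_unspecified:
        # 0.0.0.0 or :: means the service listens on every interface.
        return "all-interfaces"
    if addr.is_loopback:
        return "loopback"
    if addr.is_private:
        return "private"
    return "public"


def needs_review(host: str, auth_enabled: bool) -> bool:
    """Flag endpoints that may be reachable from untrusted networks without auth."""
    risky = {"all-interfaces", "public", "unknown"}
    return (not auth_enabled) and classify_bind(host) in risky
```

Calling `needs_review("0.0.0.0", auth_enabled=False)` flags the endpoint, while the same address with authentication enabled, or a loopback address without it, does not. Treating unresolved hostnames as risky is a conservative assumption of this sketch, not a rule from the source.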
Controls & guardrails that address this
142 proposed
Grouped by control function, with the AI lifecycle stage(s) to apply each and the other risks it addresses.
Register a safety contract per integration: pinned version, schemas, side-effect class, latency/error envelope. Gate onboarding on contract review and sign-off.
source: OWASP Top 10 for LLM Apps LLM05:2025 Improper Output Handling; NIST SP 800-53 SA-9 External System Services
Wire the agent tool layer to the CAB calendar at deployment. Test that a declared freeze blocks mutating calls before go-live.
source: NIST SP 800-53 CM-3 Configuration Change Control, CM-5 Access Restrictions for Change; ITIL change-freeze practice
Require authN/authZ on every inference API and MCP server, bind to private interfaces / front with a gateway, enforce network policy (no public exposure by default), and scope MCP tools to least privilege, so that an exposed endpoint cannot be hijacked for compute resale, prompt/history exfiltration, or lateral movement. Pair with continuous asset discovery so endpoints can't drift back to an open default.
source: Case study: operation-bizarre-bazaar-llmjacking (Pillar Security, 28 Jan 2026)
Run consistency and consensus checks across agent or model outputs to flag low-diversity agreement and amplifying error patterns, escalating or breaking the run before sycophantic convergence cascades into action.
source: Interactive-control reconciliation: ctrl-circuit-breaker (partial coverage)
Bind the agent's default execution target to non-production environments at design time. Require a separately approved promotion configuration for any production-connected target.
source: NIST SP 800-53 SC-7 Boundary Protection, CM-2 Baseline Configuration; OWASP Agentic AI Threats & Mitigations (cascading failures)
Map every dependency failure mode to a defined safe behaviour at design. Require architecture sign-off on the fallback specification before build.
source: NIST SP 800-53 CP-12 Safe Mode, SC-5 Denial-of-Service Protection; NIST AI RMF MANAGE 4.1 (post-deployment response/recovery)
Run each agent task in an isolated, network-segmented sandbox scoped to the task's exact needs. Gate onboarding on fault-injection tests proving containment.
source: NIST SP 800-53 SC-7 Boundary Protection, SC-39 Process Isolation; OWASP Agentic AI Threats & Mitigations (sandboxing/containment)
Build tracing, detection rules and breaker thresholds into the orchestrator. Prove via fault-injection tests that a failing agent is quarantined within target before release.
source: OWASP Agentic AI Threats & Mitigations (cascading failures); Cloud Security Alliance MAESTRO (multi-agent threat modelling)
Engineer mutating actions with idempotency keys, transactions and pre-change snapshots; stage writes rather than committing directly. Gate release on tested dedup and rollback within RPO.
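The idempotency-key and pre-change-snapshot pattern named in the control above can be sketched roughly as follows. This is an illustration only and is not drawn from the sources cited for the control: `StagedStore` is a hypothetical in-memory stand-in for a downstream system, and its whole-store snapshot is a simplification that a real system would replace with transactions or per-record versioning.

```python
import copy


class StagedStore:
    """Hypothetical in-memory stand-in for a downstream system."""

    def __init__(self):
        self.data = {}
        self.writes = 0        # number of writes actually performed
        self._results = {}     # idempotency key -> result of the first application
        self._snapshots = {}   # idempotency key -> full state before that write

    def apply(self, key: str, record_id: str, value: dict) -> str:
        """Apply a write once per idempotency key; retries return the first result."""
        if key in self._results:
            return self._results[key]  # duplicate request: no second write
        # Pre-change snapshot so the write can be undone later.
        self._snapshots[key] = copy.deepcopy(self.data)
        self.data[record_id] = value
        self.writes += 1
        self._results[key] = "applied"
        return "applied"

    def rollback(self, key: str) -> bool:
        """Restore the state captured before the write identified by key.

        Simplification: this restores the whole store, so it also undoes
        any writes made after the snapshot was taken.
        """
        snapshot = self._snapshots.pop(key, None)
        if snapshot is None:
            return False
        self.data = snapshot
        self._results.pop(key, None)
        return True
```

If an agent retries `apply("k1", "order-1", {...})` after a timeout, the second call returns the stored result without writing again, and `rollback("k1")` returns the store to its pre-write state. A release gate could exercise exactly these two paths as its dedup and rollback tests.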
source: NIST SP 800-53 CP-9 System Backup, CP-10 System Recovery and Reconstitution; established idempotency / safe-write engineering practice
Cap each agent's rate, volume, concurrency, and spend per downstream dependency. Trip the breaker and fail closed when a ceiling is crossed.
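A per-dependency ceiling with a fail-closed breaker, as the control above describes, might look like the following minimal sketch. It is illustrative and not taken from the sources cited for the control. The class name `DependencyBudget` and its parameters are assumptions; it covers rate (calls per sliding window) and cumulative spend only, leaving volume and concurrency caps out for brevity.

```python
import time


class DependencyBudget:
    """Rate and spend ceiling for one agent against one downstream dependency."""

    def __init__(self, max_calls: int, window_s: float, max_spend: float,
                 clock=time.monotonic):
        self.max_calls = max_calls
        self.window_s = window_s
        self.max_spend = max_spend
        self._clock = clock      # injectable so tests can control time
        self._calls = []         # timestamps of allowed calls inside the window
        self.spent = 0.0
        self.tripped = False

    def allow(self, cost: float = 0.0) -> bool:
        """Return True if the call may proceed; trip and deny once a ceiling is crossed."""
        if self.tripped:
            return False         # fail closed until an operator resets the breaker
        now = self._clock()
        # Drop timestamps that have aged out of the sliding window.
        self._calls = [t for t in self._calls if now - t < self.window_s]
        over_rate = len(self._calls) >= self.max_calls
        over_spend = self.spent + cost > self.max_spend
        if over_rate or over_spend:
            self.tripped = True
            return False
        self._calls.append(now)
        self.spent += cost
        return True

    def reset(self) -> None:
        """Manual reset; cumulative spend is intentionally kept."""
        self.tripped = False
```

The agent's tool layer would call `allow(cost)` before each downstream request and skip the request when it returns `False`. Keeping the breaker open until an explicit `reset()` is a design choice of this sketch: it favours containment over availability, which matches the fail-closed wording of the control.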
source: NIST SP 800-53 SC-5 Denial-of-Service Protection, SC-6 Resource Availability; OWASP Top 10 for LLM Apps LLM10:2025 Unbounded Consumption
Enforce hard caps on iterations, depth, wall-clock, and cost per agent run. Terminate the run on cap breach or detected loop signatures.
source: OWASP Top 10 for LLM Apps LLM10:2025 Unbounded Consumption; OWASP Agentic AI Threats & Mitigations (cascading failures)
Roll out agent changes via shadow and canary stages gated on connected-system health signals. Auto-halt and roll back to last known-good on threshold breach.
source: NIST SP 800-53 SI-2 Flaw Remediation, CM-3 Configuration Change Control; established progressive-delivery / canary practice
Deploy revocation, tool-cutoff and fleet-halt mechanisms with the release. Test every tier end-to-end and record time-to-effect before go-live.
source: OWASP Agentic AI Threats & Mitigations (kill-switch / containment); NIST AI RMF MANAGE 2.4 (mechanisms to supersede, disengage, or deactivate AI systems)
Register each release as a restorable known-good baseline and rehearse rollback at the release gate. Block promotion without a tested restore.
source: ISO/IEC 27031 ICT readiness for business continuity; NIST SP 800-34r1 Contingency Planning (Recovery phase); NIST AI RMF MANAGE 2.4 (mechanisms to supersede/disengage/deactivate)
Real-world cases
3 actual published events that illustrate this risk.
Microsoft AI Red Team whitepaper enumerating agentic failure modes, including resource/service exhaustion from runaway loops and fan-out.
Operators and researchers documented cost-amplification attacks against pay-per-token LLM apps, where crafted inputs maximise spend.
Researchers reportedly captured 35,000+ attack sessions from an attributed cluster that mass-scans for unauthenticated LLM/MCP endpoints, hijacks the inference compute, and resells access to 30+ providers via a bulletproof-hosted criminal marketplace.
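The runaway-loop and cost-amplification cases above are the kind of failure that hard per-run caps are meant to bound. The sketch below is a minimal illustration under stated assumptions: `RunGuard` and `RunCapExceeded` are hypothetical names, the loop signature is reduced to "the same action string repeated consecutively", and call-depth limits are omitted.

```python
import time


class RunCapExceeded(Exception):
    """Raised to terminate an agent run that crossed a hard cap."""


class RunGuard:
    """Hard caps on steps, wall-clock time and cost for a single agent run."""

    def __init__(self, max_steps: int, max_seconds: float, max_cost: float,
                 max_repeats: int = 3, clock=time.monotonic):
        self.max_steps = max_steps
        self.max_seconds = max_seconds
        self.max_cost = max_cost
        self.max_repeats = max_repeats
        self._clock = clock          # injectable so tests can control time
        self._start = clock()
        self.steps = 0
        self.cost = 0.0
        self._last_action = None
        self._repeats = 0

    def check(self, action: str, step_cost: float) -> None:
        """Call before each step; raises RunCapExceeded when any cap is breached."""
        self.steps += 1
        self.cost += step_cost
        # Track how many times the same action has been requested in a row.
        if action == self._last_action:
            self._repeats += 1
        else:
            self._last_action = action
            self._repeats = 1
        if self.steps > self.max_steps:
            raise RunCapExceeded("step cap")
        if self._clock() - self._start > self.max_seconds:
            raise RunCapExceeded("wall-clock cap")
        if self.cost > self.max_cost:
            raise RunCapExceeded("cost cap")
        if self._repeats >= self.max_repeats:
            raise RunCapExceeded("loop signature: repeated identical action")
```

An orchestrator would call `guard.check(action, estimated_cost)` at the top of its step loop and treat `RunCapExceeded` as a signal to stop the run and escalate. Charging the estimated cost before the step executes is a conservative choice made here so that a single expensive step cannot overshoot the cap unnoticed.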