Resource Exhaustion / Denial of Wallet
medium · Agency & tools
Definition
An AI agent gets stuck doing far more work than intended — looping, retrying, spawning more sub-tasks, or being baited into expensive actions — and the bill (compute, API calls, real money) balloons before anyone notices.
This is recommended as a granular sub-risk of #44 Disruption to connected systems (Robustness & Stability · Technology Risk). Operation Bizarre Bazaar shows that exposure of the serving plane is itself the root attack surface, independent of any model or input exploit. The case bridges resource exhaustion (compute theft, denial of wallet), supply chain (a criminal resale chain), data leakage (prompt and history exposure) and disruption to connected systems (MCP lateral movement), with denial of wallet as the primary effect. Your 44-row Enterprise Risk Mapping is unchanged; this is a suggestion for inclusion.
Where it attaches
The system components where this risk arises.
Detection signals
- Iteration / tool-call counts far above the task norm
- Cost or token spend spiking for a session or agent
- Recursive sub-agent fan-out without convergence
- Retry storms against an external API
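One way to picture these signals is as threshold checks over per-session counters. The sketch below is illustrative only: the `SessionStats` and `Baseline` types, their field names, the default values, and the 3x multiplier are all assumptions made for this example, not part of any framework or product. Sub-agent nesting depth is used as a rough proxy for fan-out without convergence.

```python
from dataclasses import dataclass


@dataclass
class SessionStats:
    """Counters for one agent session (hypothetical field names)."""
    tool_calls: int       # tool invocations so far
    cost_usd: float       # accumulated spend in dollars
    subagent_depth: int   # deepest level of nested sub-agents
    retries: int          # retries against external APIs


@dataclass
class Baseline:
    """Assumed norms for a task type; tune these from your own telemetry."""
    tool_calls: int = 20
    cost_usd: float = 0.50
    max_depth: int = 2
    retries: int = 5
    multiplier: float = 3.0  # how far above the norm counts as anomalous


def detect_signals(s: SessionStats, b: Baseline) -> list[str]:
    """Return the names of the detection signals this session triggers."""
    signals = []
    if s.tool_calls > b.tool_calls * b.multiplier:
        signals.append("tool_calls_above_norm")
    if s.cost_usd > b.cost_usd * b.multiplier:
        signals.append("cost_spike")
    if s.subagent_depth > b.max_depth:  # depth as a proxy for fan-out
        signals.append("subagent_fanout")
    if s.retries > b.retries * b.multiplier:
        signals.append("retry_storm")
    return signals


stats = SessionStats(tool_calls=140, cost_usd=0.40, subagent_depth=1, retries=30)
print(detect_signals(stats, Baseline()))
# prints ['tool_calls_above_norm', 'retry_storm']
```

In practice the baselines would likely differ per task type and be derived from historical sessions rather than fixed constants.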
Controls & guardrails that address this
4 controls, grouped by control function, with the AI lifecycle stage(s) to apply each and the other risks it addresses.
Giving the agent only the keys it needs for the current task, not a master key to everything.
Pausing to ask a person before doing anything big or hard to undo — sending money, deleting data, emailing customers.
Automatic stop-switches when AIs get stuck in loops, burn too much money, or start disagreeing with each other.
Live dashboards and alarms that notice unusual behaviour — spikes in errors, weird actions, sudden data access.
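The stop-switch and ask-a-person controls above can be sketched as a small budget guard wrapped around the agent loop. This is a minimal illustration under stated assumptions: `BudgetGuard`, `BudgetExceeded`, and every parameter name are hypothetical, per-step costs are assumed to be known after each step, and a real deployment would also enforce limits at the provider or gateway level.

```python
class BudgetExceeded(RuntimeError):
    """Raised when an agent run hits a hard limit and must stop."""


class BudgetGuard:
    """Hypothetical per-run guard: hard caps plus an approval threshold."""

    def __init__(self, max_steps: int, max_cost_usd: float,
                 approval_threshold_usd: float):
        self.max_steps = max_steps
        self.max_cost_usd = max_cost_usd
        self.approval_threshold_usd = approval_threshold_usd
        self.steps = 0
        self.cost_usd = 0.0

    def needs_approval(self, estimated_cost_usd: float,
                       irreversible: bool = False) -> bool:
        """True if a person should confirm before the action runs."""
        return irreversible or estimated_cost_usd >= self.approval_threshold_usd

    def charge(self, cost_usd: float) -> None:
        """Record one completed step; raise once a limit is exceeded."""
        self.steps += 1
        self.cost_usd += cost_usd
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} exceeded")
        if self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded(f"cost limit ${self.max_cost_usd:.2f} exceeded")


guard = BudgetGuard(max_steps=10, max_cost_usd=1.00, approval_threshold_usd=0.50)
try:
    while True:             # stand-in for an agent stuck in a loop
        guard.charge(0.25)  # each step costs $0.25 in this toy example
except BudgetExceeded as exc:
    print(f"stopped after {guard.steps} steps: {exc}")
# prints: stopped after 5 steps: cost limit $1.00 exceeded
```

Because `charge` records spend after a step completes, the run can overshoot the cap by one step; checking an estimated cost before each step would tighten that at the price of needing reliable estimates.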
Framework mappings
- OWASP Top 10 for LLM Applications: LLM10:2025 Unbounded Consumption
- MITRE ATLAS: AML.T0034 Cost Harvesting
- NIST AI RMF: MEASURE 2.6
- NIST AI RMF: MANAGE 2.2
Real-world cases
3 actual published events that illustrate this risk.
Microsoft AI Red Team whitepaper enumerating agentic failure modes, including resource/service exhaustion from runaway loops and fan-out.
Operators and researchers documented cost-amplification attacks against pay-per-token LLM apps, where crafted inputs maximise spend.
Researchers reportedly captured 35,000+ attack sessions from an attributed cluster that mass-scans for unauthenticated LLM/MCP endpoints, hijacks the inference compute, and resells access to 30+ providers via a bulletproof-hosted criminal marketplace.