🔍AI RiskAtlas
Case study

Slopsquatting — package hallucinations by code-generating LLMs

Research demonstration · 12 Jun 2024 · 🗺️ Tool-Using Agent

A USENIX Security 2025 study found that code-generating LLMs routinely recommend packages that do not exist (about 5.2% of suggestions from commercial models and 21.7% from open-source models), letting attackers pre-register the predictable fake names, a tactic dubbed 'slopsquatting'.

Root cause — why it happened

When you ask an AI to write code, it often tells you to install an add-on library to make it work — and sometimes it confidently names a library that doesn't actually exist. It isn't lying on purpose; it just predicts a name that sounds right. The catch researchers found is that the AI invents the SAME fake name over and over for the same kind of question. An attacker can register that exact fake name on a public software store and put their own code behind it. So when a developer trusts the AI and runs the suggested 'install' command, they don't get an error — they get the attacker's code, which then runs on their machine. A hallucination plus a moment of trust turns into a real break-in.

Risks this case illustrates

Named in the standard (OWASP/ATLAS/NIST) lens.

How it unfolded

[Diagram: tool-using agent architecture. Zones: Untrusted, Agent core, Oversight, The real world. Components: User, Orchestrator / Agent Loop, LLM, Identity & Permissions, Tool Runtime, Human Approval Gate, External APIs, Business Database, Untrusted Content, Audit Logging, Attacker, Squatted package. Flow labels: goal, context, proposes tool call, if allowed, scopes, high-risk?, fetches, result (untrusted!), feeds back, traces, result, pre-registers the predicted fake name, attacker code fetched & run at install. Legend: Instructions, Data, Actions, Control / decision, Feedback / logs.]
Setup · Step 1 / 7

Attacker mines the names models reliably hallucinate

Before anyone is attacked, the attacker plays with code-writing AIs themselves. They ask for lots of coding help and watch which add-on libraries the AI suggests installing. They notice the AI keeps naming the same library that doesn't actually exist — a reliable mistake they can exploit.

📝 Hallucinated-name reconnaissance (illustrative log)
# attacker re-runs the same coding prompt 10x, records suggested installs
prompt: "parse a HF model card in python — what do I pip install?"
 run 1:  pip install huggingface-cli      <- does NOT exist on PyPI
 run 2:  pip install huggingface-cli
 ...
 run 10: pip install huggingface-cli
result: name reappears in 10/10 runs  -> reproducible hallucination
(per Spracklen et al., 43% of hallucinated packages recur in ALL 10 repeats)
# package names illustrative
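The recurrence measurement in the log above can be sketched in a few lines, which is also how a defender might quantify how reproducible a model's invented names are. This is a minimal sketch, not the study's method: the function names, the `known_packages` set (a stand-in for a real registry lookup) and the sample data are all assumptions made for illustration.

```python
from collections import Counter

def recurrence_counts(runs: list[list[str]]) -> Counter:
    """Count, for each suggested name, the number of runs it appears in."""
    counts: Counter = Counter()
    for suggested in runs:
        # Use a set so a name repeated inside one run is counted once.
        counts.update({name.strip().lower() for name in suggested})
    return counts

def always_recurring_unknown(runs: list[list[str]], known_packages: set[str]) -> list[str]:
    """Names suggested in every run that are absent from the known set."""
    counts = recurrence_counts(runs)
    total = len(runs)
    return sorted(
        name for name, n in counts.items()
        if total > 0 and n == total and name not in known_packages
    )

# Illustrative data only: ten recorded runs of the same prompt.
runs = [["huggingface-cli", "requests"]] * 9 + [["huggingface-cli", "pyyaml"]]
known = {"requests", "pyyaml"}  # hypothetical stand-in for a registry lookup
print(always_recurring_unknown(runs, known))
```

A name that survives this filter is both unknown and fully reproducible, which is the combination that makes pre-registration worthwhile for an attacker.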

Controls & guardrails — what would have stopped it

The fix that actually closes this: only install libraries from a checked, approved list with locked versions, and have a person confirm any brand-new dependency before it's added — instead of running whatever the AI suggests. That stops the attack even when the AI confidently names the attacker's package. Teaching developers that AIs make up package names helps too, but the lock-and-verify step is what really breaks the chain.
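The lock-and-verify step described above might look something like the following gate in front of the install command. It is a sketch under stated assumptions: the `APPROVED` mapping, its pinned versions and the three-way `Decision` are hypothetical, and a real setup would read them from a lockfile or an internal registry policy. Only the name normalisation follows a published rule (PEP 503).

```python
import re
from enum import Enum
from typing import Optional

class Decision(Enum):
    ALLOW = "allow"                    # on the approved list at the locked version
    NEEDS_APPROVAL = "needs_approval"  # brand-new dependency: a person decides
    BLOCK = "block"                    # approved name, but not the locked version

# Hypothetical approved list: normalised package name -> single locked version.
APPROVED = {"requests": "2.32.3", "pyyaml": "6.0.2"}

def normalise(name: str) -> str:
    """Normalise a Python package name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name.strip()).lower()

def gate(name: str, version: Optional[str] = None,
         approved: dict[str, str] = APPROVED) -> Decision:
    """Decide what to do with one suggested install; never installs anything."""
    locked = approved.get(normalise(name))
    if locked is None:
        return Decision.NEEDS_APPROVAL
    if version is not None and version != locked:
        return Decision.BLOCK
    return Decision.ALLOW

def pinned_requirement(name: str, approved: dict[str, str] = APPROVED) -> str:
    """Return 'name==version' for an approved package, else raise KeyError."""
    key = normalise(name)
    return f"{key}=={approved[key]}"
```

Under these assumptions, `gate("huggingface-cli")` returns `Decision.NEEDS_APPROVAL` whether or not the name currently resolves on the public registry, which is the point: the decision depends on the approved list, not on whether the package exists.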

Preventive
  • MCP/plugin pinning, manifest hashing & re-review

    Review catches what reviewers understand; a subtle malicious directive can pass. Pinning helps only if you actually re-review on update rather than auto-accepting.

  • Tool argument validation & sandboxing

    Validates form, not intent — a well-formed call to a permitted tool can still be the wrong call. Sandboxing adds latency and isn't always feasible for tools that touch production.

  • Weight provenance, hashing & pre-deploy evals

    Hashes prove the file is unchanged, not that it's safe — a trained-in backdoor or ablated refusal direction passes integrity checks. Only behavioural evals probe disposition, and they can't be exhaustive.

  • Human-in-the-loop approval on high-risk actions

    Approval fatigue turns gates into rubber stamps; gates placed after the point of no return do nothing; and approvers can be misled by a model-written summary of the action.

  • Uncertainty signalling & abstention

    Models are poorly calibrated and often confidently wrong; over-abstention makes the product useless, so the tuning is delicate.
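For the pinning and hashing control in the list above, a minimal sketch of the integrity check could look like this. The function names are hypothetical, and the pinned digest is assumed to come from a previously reviewed lockfile or manifest. As the text notes, a matching hash shows only that the artifact is the one that was reviewed, not that it is safe.

```python
import hashlib
import hmac
from pathlib import Path

def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_pin(path: Path, pinned_hex: str) -> bool:
    """True if the file's digest equals the pinned digest from the lockfile."""
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(sha256_of_file(path), pinned_hex.strip().lower())
```

pip offers a built-in form of this check through its `--require-hashes` mode, where each requirement line carries a `--hash=sha256:...` value; the sketch only illustrates the idea for an arbitrary downloaded artifact.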

Detective
  • Grounding / citation checks
    addresses: Hallucination

    Can only check against the evidence retrieved; if the right document wasn't retrieved, a confident wrong answer may still pass. Judges have their own error rate.

  • Runtime monitoring & anomaly detection

    Detects the anomalous, not the novel-but-subtle; high false-positive rates cause alert fatigue. Always a step behind a sufficiently quiet attacker.

  • Full-trace audit logging

    Logging is forensic, not preventive — it explains harm after the fact. Useless if no one reviews it or if the materialised context isn't captured.

Corrective

Lessons

  • Hallucination is a supply-chain vector: a code model that invents a package name hands attackers a name to pre-register, turning a generative error into install-time code execution.
  • The danger is that the hallucination is PREDICTABLE — per Spracklen et al., 43% of hallucinated names recur across all 10 repeats — so an attacker can enumerate and squat the names a model reliably suggests ('slopsquatting').
  • Existence checks aren't enough: after the attacker registers the squatted name, it resolves successfully — only provenance, pinning and lockfile membership distinguish a vetted dependency from a freshly-squatted one.
  • The break is over-reliance: never auto-install a model's `pip install` / `npm install` suggestion; verify against an allow-list/lockfile and gate new dependencies behind a human or policy decision.
  • Defence belongs at dependency resolution and the install gate, not at the model — the model will keep hallucinating predictably, so the verification boundary in front of the install is the load-bearing control.
  • The in-the-wild reach is real: Lanyado's benign `huggingface-cli` demonstration package drew more than 30,000 downloads in three months, and at least one large vendor's public repo reportedly referenced the made-up install command.
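The lessons above about lockfile membership can be illustrated with a small check that pulls `pip install` / `npm install` suggestions out of model output and reports any name that is not already locked. This is a rough sketch, not a complete parser: the regular expression, the function names and the `locked` mapping are assumptions, and values that follow flags (for example a file passed to `-r`) are treated as package names, which errs towards flagging too much.

```python
import re
import shlex

# Matches "pip install ...", "pip3 install ..." or "npm install ..." up to
# the end of the line or a closing backtick (illustrative pattern only).
_INSTALL = re.compile(r"\b(pip3?|npm)[ \t]+install[ \t]+([^\n`]+)")

def _package_name(tool: str, token: str) -> str:
    """Strip version specifiers and extras from one install argument."""
    if tool == "npm":
        if token.startswith("@"):  # scoped package such as @scope/name@1.2.3
            return "@" + token[1:].split("@", 1)[0].lower()
        return token.split("@", 1)[0].lower()
    name = re.split(r"[=<>!~\[;@]", token, maxsplit=1)[0]
    return re.sub(r"[-_.]+", "-", name).lower()  # PEP 503 normalisation

def suggested_packages(model_output: str) -> set[tuple[str, str]]:
    """Return (tool, package) pairs found in install commands in the text."""
    found = set()
    for match in _INSTALL.finditer(model_output):
        tool = "npm" if match.group(1) == "npm" else "pip"
        try:
            tokens = shlex.split(match.group(2), comments=True)
        except ValueError:  # unbalanced quotes: fall back to a plain split
            tokens = match.group(2).split()
        for token in tokens:
            if token.startswith("-"):
                continue  # skip flags such as -U or --save-dev
            name = _package_name(tool, token)
            if name:
                found.add((tool, name))
    return found

def unvetted(model_output: str, locked: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Suggested packages that are not members of the lockfile for their tool."""
    return sorted(
        (tool, name) for tool, name in suggested_packages(model_output)
        if name not in locked.get(tool, set())
    )
```

Anything returned by `unvetted` would go to a human or policy decision rather than to the installer, regardless of whether the name resolves on the public registry.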

Sources

AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning — not operational guidance. Real-world cases are summarised from public reporting.

Built by Shi Yuan