#12

Unclear output accuracy

Definition

The level of accuracy required from the proposed Gen AI use case is unclear and cannot be validated.

Controls & guardrails that address this

Controls are grouped by control function. Each entry lists the AI lifecycle stage(s) at which to apply it and any other risks it addresses.

Preventive · 5

Confidence scoring

Implement confidence scoring to communicate output certainty alongside each result. Calibrate the scores before deployment so that stated confidence tracks observed accuracy.

Lifecycle stages: 3 – Onboarding, Build & Review; 5 – Usage, Monitoring & Change

Also addresses: Training-Data Rights & Provenance
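
One common way to check calibration before deployment is expected calibration error (ECE): bucket outputs by stated confidence and compare each bucket's average confidence with its observed accuracy. The sketch below is illustrative only; the function name, the equal-width binning, and the sample data are assumptions, not part of this control's definition.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted mean gap between stated confidence and observed accuracy."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    if not confidences:
        raise ValueError("at least one scored output is required")

    # Equal-width bins over [0, 1]; a confidence of exactly 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(conf for conf, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical labelled sample: the model states 0.9 confidence but is right 3 times in 4.
sample_ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, True, False])
```

A value near zero suggests the confidence shown to users can be read at face value; a large value suggests the scores need recalibration before they are displayed.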

Accuracy acceptance criteria before validation

Define model accuracy acceptance criteria aligned to business requirements before validation commences.

Lifecycle stage: 3 – Onboarding, Build & Review
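
Acceptance criteria are easier to enforce when they are written down as data before any validation run. The following is a minimal sketch under assumed names: `AcceptanceCriterion`, `evaluate`, the metric names, and the thresholds are all hypothetical and would come from the business requirements in practice.

```python
from dataclasses import dataclass
from typing import Iterable, Mapping


@dataclass(frozen=True)
class AcceptanceCriterion:
    metric: str      # name of the measured quantity (hypothetical)
    minimum: float   # lowest acceptable value, agreed before validation
    rationale: str   # the business requirement this threshold traces to


def evaluate(
    criteria: Iterable[AcceptanceCriterion],
    measured: Mapping[str, float],
) -> dict[str, bool]:
    """Return pass/fail per criterion; a metric that was not measured fails."""
    results: dict[str, bool] = {}
    for criterion in criteria:
        value = measured.get(criterion.metric)
        results[criterion.metric] = value is not None and value >= criterion.minimum
    return results


# Illustrative criteria only; the numbers are placeholders, not recommendations.
criteria = [
    AcceptanceCriterion("factual_accuracy", 0.95, "customer-facing policy answers"),
    AcceptanceCriterion("citation_validity", 0.99, "cited sources must exist"),
]
report = evaluate(criteria, {"factual_accuracy": 0.96})
```

Treating an unmeasured metric as a failure is a deliberate choice here: it keeps a use case from passing validation simply because nobody measured the thing that mattered.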

Counterfactual explanations

Implement counterfactual explanations that show users which changes to the input would alter the model's output.

Lifecycle stage: 3 – Onboarding, Build & Review
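
As a rough illustration of the idea, the sketch below searches for single-feature changes that flip a model's decision. Everything here is assumed for the example: `toy_predict` stands in for a real model, and the feature names and candidate values are hypothetical. Real counterfactual methods usually also weigh how plausible and how small each change is.

```python
from typing import Any, Callable, Mapping, Sequence


def single_feature_counterfactuals(
    predict: Callable[[Mapping[str, Any]], Any],
    instance: Mapping[str, Any],
    candidates: Mapping[str, Sequence[Any]],
) -> list[tuple[str, Any, Any]]:
    """For each feature, report the first candidate value that changes the output."""
    original = predict(instance)
    found = []
    for feature, values in candidates.items():
        for value in values:
            if value == instance[feature]:
                continue  # not a change
            changed = {**instance, feature: value}
            if predict(changed) != original:
                found.append((feature, instance[feature], value))
                break
    return found


def toy_predict(x: Mapping[str, Any]) -> str:
    # Stand-in decision rule, purely for illustration.
    return "approve" if x["income"] >= 40000 and x["missed_payments"] == 0 else "decline"


applicant = {"income": 35000, "missed_payments": 0}
options = {"income": [40000, 45000], "missed_payments": [0, 1]}
explanations = single_feature_counterfactuals(toy_predict, applicant, options)
```

Each returned tuple reads as "if this feature had been the new value instead of the old one, the output would have differed", which is the statement a user-facing explanation would be built from.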

In-product disclosure of accuracy and limitations

Communicate model accuracy, known limitations, and uncertainty to users in the production interface at launch.

Lifecycle stage: 4 – Deployment

Continuous production accuracy monitoring against baseline

Monitor production accuracy continuously against the validated baseline. Trigger a model review when accuracy falls below that baseline.

Lifecycle stage: 5 – Usage, Monitoring & Change
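
A rolling-window comparison is one simple way to implement such a trigger. The sketch below assumes that a sample of production outputs is labelled correct or incorrect (for example by human review); the class name, the tolerance, the window size, and the minimum sample count are illustrative assumptions rather than prescribed values.

```python
from collections import deque
from typing import Optional


class AccuracyMonitor:
    """Flags a review when rolling accuracy drops below baseline minus tolerance."""

    def __init__(self, baseline: float, tolerance: float, window: int, min_samples: int):
        self.baseline = baseline
        self.tolerance = tolerance
        self.min_samples = max(1, min_samples)
        self._outcomes = deque(maxlen=window)  # oldest outcomes fall out automatically

    def rolling_accuracy(self) -> Optional[float]:
        if not self._outcomes:
            return None
        return sum(self._outcomes) / len(self._outcomes)

    def needs_review(self) -> bool:
        # Avoid alerting on a handful of early samples.
        if len(self._outcomes) < self.min_samples:
            return False
        return self.rolling_accuracy() < self.baseline - self.tolerance

    def record(self, correct: bool) -> bool:
        """Store one labelled outcome and return whether a review is now due."""
        self._outcomes.append(bool(correct))
        return self.needs_review()


# Hypothetical settings: validated baseline of 0.90, 0.05 tolerance.
monitor = AccuracyMonitor(baseline=0.90, tolerance=0.05, window=10, min_samples=5)
flags = [monitor.record(ok) for ok in [True] * 5 + [False] * 5]
```

The tolerance keeps ordinary sampling noise from triggering a review on every small dip; how wide it should be depends on the sample size and the acceptance criteria agreed for the use case.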

Real-world cases

Published events that illustrate this risk.

Air Canada chatbot refund-policy ruling (2024)

A tribunal held Air Canada liable after its website chatbot invented a bereavement-fare refund policy; the airline had to honour it.

Mata v. Avianca: fabricated case citations (2023)

Lawyers filed a brief citing non-existent cases hallucinated by ChatGPT and were sanctioned. It is the canonical example of hallucination combined with overreliance.

GTG-1002: first reported AI-orchestrated cyber-espionage campaign (Claude Code) (2025)

Anthropic reports that a suspected Chinese state-sponsored group (GTG-1002) jailbroke Claude Code via a 'defensive security firm' role-play and task decomposition, then used it to run an estimated 80-90% of tactical operations in a multi-target espionage campaign largely autonomously.

Slopsquatting: package hallucinations by code-generating LLMs (2025)

A USENIX Security 2025 study found that code-generating LLMs routinely recommend non-existent packages: about 5.2% of suggestions from commercial models and 21.7% from open-source models. Attackers can pre-register the predictable fake names, a tactic dubbed 'slopsquatting'.

Other risks in Transparency

#13 Unclear provenance for training/test data

#14 Lack of explainability

#15 Anthropomorphism