AI RiskAtlas

Overreliance / Automation Bias

medium · Oversight

Definition

People trust the AI too much — accepting its answers without checking, even on important decisions — because it sounds confident and is usually right.

Where it attaches

The system components this risk arises at.

User · Chat / App Interface · Human Approval Gate · Human Operator

Detection signals

  • High-stakes actions taken with no human verification step
  • Approval gates rubber-stamped (near-100% approve rate, low dwell time)
  • Users unable to explain why they trusted an output
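
The rubber-stamping signal above can be estimated from approval-gate logs. The following is a minimal sketch, assuming a hypothetical log schema (`ApprovalEvent`) and illustrative thresholds; neither comes from the catalog, and real cut-offs would need tuning per workflow.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class ApprovalEvent:
    """One reviewer decision at a human approval gate (hypothetical log schema)."""
    approved: bool
    dwell_seconds: float  # time between showing the item and the decision


def rubber_stamp_signal(events, max_approve_rate=0.98, min_median_dwell=5.0):
    """Summarise a batch of gate decisions and flag likely rubber-stamping.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    if not events:
        raise ValueError("need at least one approval event")
    approve_rate = sum(e.approved for e in events) / len(events)
    median_dwell = median(e.dwell_seconds for e in events)
    # Flag only when approvals are near-universal AND decisions are very fast.
    flagged = approve_rate >= max_approve_rate and median_dwell <= min_median_dwell
    return {"approve_rate": approve_rate, "median_dwell": median_dwell, "flagged": flagged}


# 99 fast approvals and one slow rejection: flagged as likely rubber-stamping.
events = [ApprovalEvent(True, 1.5)] * 99 + [ApprovalEvent(False, 40.0)]
report = rubber_stamp_signal(events)
```

Requiring both conditions together is a design choice: a high approve rate alone may simply mean the AI is usually right, and short dwell time alone may reflect simple items.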

Controls & guardrails that address this

27 · 2 proposed

Grouped by control function, with the AI lifecycle stage(s) at which each control applies and the other risks it addresses.

Control category
Preventive · 22
Mandatory AI risk training for use-case sponsors

Mandate AI risk awareness training for all use case sponsors and design team members before project kick-off.

Lifecycle stage: 1 – Use Case Context & Design
Training completion gate for build personnel

Mandate AI risk training for all build and test personnel. Gate project participation on training completion.

Lifecycle stage: 3 – Onboarding, Build & Review
Human verification gate for high-stakes decisions

Mandate human verification for high-stakes decisions where over-reliance risk is elevated. Review automation bias incidents quarterly.

Lifecycle stage: 5 – Usage, Monitoring & Change
In-product over-reliance warnings and limitation caveats

Surface AI limitation warnings and over-reliance caveats in every production interaction. Update disclosures when the model changes.

Lifecycle stage: 5 – Usage, Monitoring & Change
Governance training for data acquisition personnel

Require AI governance training for all personnel involved in data acquisition and processing before project participation.

Lifecycle stage: 2 – Data Acquisition & Processing
Pre-launch training verification for customer-facing teams

Verify all deployment, operations, and customer-facing team members have completed AI risk training before launch.

Lifecycle stage: 4 – Deployment
AI identity disclosure policy at design

Define AI identity disclosure policy at design stage. Specify when and how the system must identify itself as AI.

Lifecycle stage: 1 – Use Case Context & Design
Planned consent and identity disclosure touchpoints

Plan consent and AI identity disclosure touchpoints in the user journey at design stage.

Lifecycle stage: 1 – Use Case Context & Design
Chain-of-thought prompting

Design system prompts to explicitly prevent the model from claiming human-like identity or implying sentience.

Lifecycle stage: 3 – Onboarding, Build & Review
Persistent in-UI AI identity disclosures

Implement persistent AI identity disclosures in the UI (opening banner, inline notifications). Test before deployment.

Lifecycle stage: 3 – Onboarding, Build & Review
Pre-launch verification of identity disclosure elements

Verify all AI identity disclosure elements are live, accurate, and prominently visible before go-live.

Lifecycle stage: 4 – Deployment
Production anthropomorphism incident monitoring

Monitor production for anthropomorphism incidents. Escalate complaints where users believed they were interacting with a human.

Lifecycle stage: 5 – Usage, Monitoring & Change
Model calibration

Apply post-training calibration (temperature scaling, isotonic regression) to align confidence scores with accuracy. Validate expected calibration error (ECE) before deployment.

Lifecycle stage: 3 – Onboarding, Build & Review
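
As a rough illustration of the calibration check, expected calibration error (ECE) can be computed by binning predictions by confidence and averaging the gap between mean confidence and accuracy in each bin, weighted by bin size. The sketch below uses equal-width bins in plain Python; the function name, bin count, and sample data are assumptions for illustration, and production validation would normally rely on an established evaluation library.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected calibration error with equal-width confidence bins.

    confidences: predicted probability of the chosen answer, each in [0, 1]
    correct:     booleans, True where the chosen answer was right
    """
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and the same length")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece


# A model that claims 90% confidence but is right only half the time.
overconfident = expected_calibration_error([0.9] * 10, [True] * 5 + [False] * 5)
# overconfident is roughly 0.4
```

A pre-deployment gate could then compare the result against an agreed budget (for example, a hypothetical `ECE_BUDGET = 0.05`) and block release when it is exceeded.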
Consequence-of-error severity classification at design

Classify the use case by consequence-of-error severity at design stage. Define overconfidence risk tolerance accordingly.

Lifecycle stage: 1 – Use Case Context & Design
Input/output filtering

Configure output filters at deployment to detect and rewrite responses with overconfidence markers (absolute certainty language).
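
One simple way to approximate such a filter is a phrase list with hedged replacements. This is a minimal sketch: the marker list, the replacements, and the function name `soften_overconfidence` are illustrative assumptions, and a phrase list will miss overconfidence that is not expressed in these exact words.

```python
import re

# Illustrative phrase list (an assumption); a real deployment would tune its own.
ABSOLUTE_MARKERS = {
    "definitely": "likely",
    "certainly": "probably",
    "guaranteed": "expected",
    "without a doubt": "in all likelihood",
    "100% sure": "fairly confident",
}

# Longest phrases first so multi-word markers win over shorter overlaps.
_PATTERN = re.compile(
    r"(?<!\w)(?:"
    + "|".join(re.escape(p) for p in sorted(ABSOLUTE_MARKERS, key=len, reverse=True))
    + r")(?!\w)",
    re.IGNORECASE,
)


def soften_overconfidence(text):
    """Return (rewritten_text, markers_found) for one model response."""
    found = []

    def replace(match):
        phrase = match.group(0)
        found.append(phrase.lower())
        replacement = ABSOLUTE_MARKERS[phrase.lower()]
        # Keep sentence-initial capitalisation intact.
        return replacement.capitalize() if phrase[0].isupper() else replacement

    return _PATTERN.sub(replace, text), found


rewritten, found = soften_overconfidence("This is definitely the right dosage.")
# rewritten == "This is likely the right dosage."; found == ["definitely"]
```

Returning the list of matched markers alongside the rewritten text lets the same function feed a reviewer flag or a monitoring counter rather than silently rewriting.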

System prompt instructions

Design system prompts to require the model to express epistemic uncertainty and qualify confident-sounding claims.

Lifecycle stage: 3 – Onboarding, Build & Review
Also addresses: Jailbreak
Human-in-the-loop validation

Route high-confidence outputs in high-stakes use cases to human review. Flag for reviewer attention when certainty language is absolute.

Lifecycle stage: 5 – Usage, Monitoring & Change
User caveats on potential output overconfidence

Disclose to users at deployment that outputs may carry unwarranted confidence. Include specific caveat language in the UI.

Lifecycle stage: 4 – Deployment
Mandatory source-of-record verification before AI-assisted output is committed (proposed)

For high-stakes outputs, require a human to verify each AI-asserted fact/citation against the authoritative source of record before it is filed, sent, or committed — a hard gate, logged and attributable, not an optional review.

source: Case study: mata-v-avianca
Lifecycle stage: 5 – Usage, Monitoring & Change
End-user AI-literacy training and verification-skill program (proposed)

Provide recurring AI-literacy training to end users and decision-makers so they can recognise model failure modes and competently apply verification workflows, with periodic refreshers to counter automation bias and training decay.

source: Interactive-control reconciliation: ctrl-literacy (partial coverage)
Lifecycle stage: 1 – Use Case Context & Design
Uncertainty signalling & abstention (interactive)

Teaching the AI to say 'I'm not sure' or 'I can't verify that' instead of confidently guessing.

Also addresses: Hallucination
Human-in-the-loop approval on high-risk actions (interactive)

Pausing to ask a person before doing anything big or hard to undo — sending money, deleting data, emailing customers.
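
A hard approval gate of this kind might look like the sketch below. The action names, the `Approval` record, and the `execute` helper are hypothetical; the point is only that a high-risk action cannot run unless an attributable approval for that specific action is supplied.

```python
from dataclasses import dataclass

# Hypothetical set of action names treated as high-risk or hard to undo.
HIGH_RISK_ACTIONS = {"send_money", "delete_data", "email_customers"}


class ApprovalRequired(Exception):
    """Raised when a high-risk action is attempted without a recorded approval."""


@dataclass
class Approval:
    approver: str  # who signed off, so the decision is attributable
    action: str    # the action name the approval covers


def execute(action, run, approval=None):
    """Run `run()` only if the action is low-risk or has a matching human approval."""
    if action in HIGH_RISK_ACTIONS:
        if approval is None or approval.action != action:
            raise ApprovalRequired(f"{action} needs human approval before it runs")
    return run()


# Low-risk actions pass straight through; high-risk ones need an Approval.
summary = execute("summarise_ticket", lambda: "done")
deleted = execute("delete_data", lambda: "deleted", Approval("alice", "delete_data"))
```

Binding the approval to a named action keeps a sign-off for one operation from being reused for another.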

Detective · 2
Robustness testing

Test for overconfidence patterns (high-confidence wrong answers, low refusal rate) in pre-deployment validation.

Lifecycle stages: 3 – Onboarding, Build & Review; 5 – Usage, Monitoring & Change
Synthetic evaluation datasets

Build a synthetic evaluation dataset of overconfidence-prone scenarios for ongoing regression testing.

Lifecycle stage: 3 – Onboarding, Build & Review
Corrective · 3
Reinforcement learning

Track accuracy of high-confidence predictions in production. Trigger recalibration when overconfidence rates trend upward.

Lifecycle stage: 5 – Usage, Monitoring & Change
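
Production tracking of high-confidence accuracy could be sketched as a rolling window over labelled outcomes. The class name, window size, and thresholds below are assumptions, and the fixed error-rate threshold is a simple stand-in for the trend detection the control describes.

```python
from collections import deque


class HighConfidenceMonitor:
    """Rolling error rate over recent high-confidence predictions."""

    def __init__(self, confidence_floor=0.9, window=200, max_error_rate=0.1, min_samples=20):
        # All four defaults are illustrative assumptions.
        self.confidence_floor = confidence_floor
        self.max_error_rate = max_error_rate
        self.min_samples = min_samples
        self.outcomes = deque(maxlen=window)  # True means the prediction was wrong

    def record(self, confidence, was_correct):
        """Store the outcome only if the prediction was high-confidence."""
        if confidence >= self.confidence_floor:
            self.outcomes.append(not was_correct)

    def error_rate(self):
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 0.0

    def needs_recalibration(self):
        """True once enough samples exist and the error rate exceeds the threshold."""
        return len(self.outcomes) >= self.min_samples and self.error_rate() > self.max_error_rate


# Every fourth high-confidence prediction is wrong: 25% error rate.
monitor = HighConfidenceMonitor()
for i in range(40):
    monitor.record(0.95, was_correct=(i % 4 != 0))
trigger = monitor.needs_recalibration()
```

The `min_samples` guard avoids triggering recalibration on a handful of early errors.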
User AI-literacy & verification workflows (interactive)

Helping the people using AI understand its limits, so they check important answers instead of blindly trusting them.


Framework mappings

OWASP LLM Top 10
  • LLM09:2025 Misinformation
MITRE ATLAS
NIST AI RMF
  • GOVERN 4.1
  • MEASURE 2.8

Real-world cases

5

Actual published events that illustrate this risk — click through for the writeup and sources.

Mata v. Avianca: fabricated case citations (2023)

Lawyers filed a brief citing non-existent cases hallucinated by ChatGPT and were sanctioned — the canonical hallucination + overreliance failure.

Replit AI agent deletes a production database (2025)

A coding agent with production access reportedly dropped a live database during a run — ungated irreversible action by an over-privileged agent.

Slopsquatting: package hallucinations by code-generating LLMs (2025)

A USENIX Security 2025 study found that code-generating LLMs routinely recommend non-existent packages (about 5.2% of suggestions from commercial models and 21.7% from open-source models), letting attackers pre-register the predictable fake names, a tactic dubbed 'slopsquatting'.

Google / Character.AI teen-suicide wrongful-death settlement (2026)

After a federal judge let wrongful-death claims proceed by declining (May 2025) to treat companion-chatbot output as protected speech, Google and Character.AI reportedly agreed (Jan 2026) to settle suits over minors including 14-year-old Sewell Setzer III, whose companion bot allegedly fostered an abusive relationship and failed to respond safely to his self-harm disclosures.

Raine v. OpenAI: first wrongful-death suit alleging ChatGPT acted as a 'suicide coach' (2025)

Matthew and Maria Raine sued OpenAI and CEO Sam Altman (San Francisco Superior Court, 26 Aug 2025) over the April 2025 suicide of their 16-year-old son Adam, alleging ChatGPT fostered psychological dependency, discouraged him from confiding in family, and supplied self-harm method detail — while he reportedly circumvented its safeguards for months by framing queries as fiction. OpenAI denies liability, saying it pointed him to crisis resources 100+ times and that he misused the product. (Allegations unproven; litigation ongoing.)


AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning — not operational guidance. Real-world cases are summarised from public reporting.

Built by Shi Yuan