Overreliance / Automation Bias

medium · Oversight

Definition

People trust the AI too much — accepting its answers without checking, even on important decisions — because it sounds confident and is usually right.

Where it attaches

The system components this risk arises at.

🧑 User · 💬 Chat / App Interface · ✋ Human Approval Gate · 🧑‍⚖️ Human Operator

Detection signals

▸ High-stakes actions taken with no human verification step
▸ Approval gates rubber-stamped (near-100% approve rate, low dwell time)
▸ Users unable to explain why they trusted an output
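
The rubber-stamping signal can be approximated from approval-gate logs. The sketch below assumes each log entry records whether the reviewer approved and how long the item stayed open (dwell time). The function name, record shape, and both thresholds are hypothetical and would need tuning per workflow.

```python
from statistics import median


def rubber_stamp_flags(decisions, max_approve_rate=0.98, min_median_dwell_s=5.0):
    """Summarise one reviewer's approval-gate behaviour.

    decisions: list of (approved: bool, dwell_seconds: float) tuples.
    Thresholds are illustrative assumptions, not recommended values.
    """
    if not decisions:
        return {"approve_rate": None, "median_dwell_s": None, "flagged": False}

    approve_rate = sum(1 for approved, _ in decisions if approved) / len(decisions)
    median_dwell = median(dwell for _, dwell in decisions)

    # Flag only when both signals co-occur: almost everything is approved
    # AND the reviewer spends very little time on each item.
    flagged = approve_rate >= max_approve_rate and median_dwell < min_median_dwell_s
    return {
        "approve_rate": approve_rate,
        "median_dwell_s": median_dwell,
        "flagged": flagged,
    }
```

Requiring both conditions keeps the check from flagging a careful reviewer who simply receives good outputs; a high approve rate alone is weak evidence.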

Controls & guardrails that address this

27 · 2 proposed

Grouped by control function, with the AI lifecycle stage(s) where each control applies and the other risks it addresses.

Preventive · 22

Mandatory AI risk training for use-case sponsors

Mandate AI risk awareness training for all use case sponsors and design team members before project kick-off.

Lifecycle stage: 1 – Use Case Context & Design

Training completion gate for build personnel

Mandate AI risk training for all build and test personnel. Gate project participation on training completion.

Lifecycle stage: 3 – Onboarding, Build & Review

Human verification gate for high-stakes decisions

Mandate human verification for high-stakes decisions where over-reliance risk is elevated. Review automation bias incidents quarterly.

Lifecycle stage: 5 – Usage, Monitoring & Change

In-product over-reliance warnings and limitation caveats

Surface AI limitation warnings and over-reliance caveats in every production interaction. Update the disclosures whenever the model changes.

Lifecycle stage: 5 – Usage, Monitoring & Change

Governance training for data acquisition personnel

Require AI governance training for all personnel involved in data acquisition and processing before project participation.

Lifecycle stage: 2 – Data Acquisition & Processing

Pre-launch training verification for customer-facing teams

Verify all deployment, operations, and customer-facing team members have completed AI risk training before launch.

Lifecycle stage: 4 – Deployment

AI identity disclosure policy at design

Define AI identity disclosure policy at design stage. Specify when and how the system must identify itself as AI.

Lifecycle stage: 1 – Use Case Context & Design

Planned consent and identity disclosure touchpoints

Plan consent and AI identity disclosure touchpoints in the user journey at design stage.

Lifecycle stage: 1 – Use Case Context & Design

Chain-of-thought prompting

Design system prompts to explicitly prevent the model from claiming human-like identity or implying sentience.

Lifecycle stage: 3 – Onboarding, Build & Review

Persistent in-UI AI identity disclosures

Implement persistent AI identity disclosures in the UI (opening banner, inline notifications). Test before deployment.

Lifecycle stage: 3 – Onboarding, Build & Review

Pre-launch verification of identity disclosure elements

Verify all AI identity disclosure elements are live, accurate, and prominently visible before go-live.

Lifecycle stage: 4 – Deployment

Production anthropomorphism incident monitoring

Monitor production for anthropomorphism incidents. Escalate complaints where users believed they were interacting with a human.

Lifecycle stage: 5 – Usage, Monitoring & Change

Model calibration

Apply post-training calibration (temperature scaling, isotonic regression) to align confidence scores with accuracy. Validate expected calibration error (ECE) before deployment.

Lifecycle stage: 3 – Onboarding, Build & Review
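
As an illustration of the ECE check named in this control, here is a minimal sketch in plain Python. It uses equal-width confidence bins, which is one common formulation; the function name, the default of 10 bins, and the input shape (parallel lists of confidences and correctness flags from a held-out validation set) are assumptions made for the example.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE over equal-width bins on [0, 1].

    confidences: model confidence per prediction, each in [0, 1].
    correct:     matching booleans, True if the prediction was right.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        # Weight each bin's confidence/accuracy gap by its share of samples.
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece
```

A model that reports 0.9 confidence but is right only half the time scores an ECE of about 0.4 on those predictions; a pre-deployment gate could compare the measured value against a tolerance set during the consequence-of-error classification.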

Consequence-of-error severity classification at design

Classify the use case by consequence-of-error severity at design stage. Define overconfidence risk tolerance accordingly.

Lifecycle stage: 1 – Use Case Context & Design

Input/output filtering

Configure output filters at deployment to detect and rewrite responses with overconfidence markers (absolute certainty language).

Lifecycle stage: 4 – Deployment

Also addresses: Bias Amplification & Sycophancy · Sensitive Data Leakage · KV-Cache & Inference-State Side Channels
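
An output filter for overconfidence markers could start as simply as a phrase list. The sketch below only detects absolute-certainty language and appends a caveat; it does not attempt the rewrite step, which would likely need a second model pass. The phrase list, the caveat wording, and the function names are illustrative assumptions.

```python
import re

# Hypothetical marker list; a real deployment would curate this per domain.
CERTAINTY_MARKERS = re.compile(
    r"\b(?:definitely|certainly|undoubtedly|guaranteed|without a doubt"
    r"|100% (?:sure|certain))\b",
    re.IGNORECASE,
)

# Hypothetical caveat text.
CAVEAT = "Note: this answer may be wrong. Verify it before acting on it."


def find_overconfidence_markers(text):
    """Return the absolute-certainty phrases found in text, lowercased."""
    return [m.group(0).lower() for m in CERTAINTY_MARKERS.finditer(text)]


def filter_output(text):
    """Append a caveat when certainty markers are present.

    Returns (possibly amended text, list of markers found).
    """
    markers = find_overconfidence_markers(text)
    if not markers:
        return text, markers
    return text + "\n\n" + CAVEAT, markers
```

Returning the matched markers alongside the text lets the same pass feed monitoring, for example counting how often responses trip the filter.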

System prompt instructions

Design system prompts to require the model to express epistemic uncertainty and qualify confident-sounding claims.

Lifecycle stage: 3 – Onboarding, Build & Review

Also addresses: Jailbreak

Human-in-the-loop validation

Route high-confidence outputs in high-stakes use cases to human review. Flag for reviewer attention when certainty language is absolute.

Lifecycle stage: 5 – Usage, Monitoring & Change

Also addresses: Hallucination · Model Drift & Silent Degradation

User caveats on potential output overconfidence

Disclose to users at deployment that outputs may carry unwarranted confidence. Include specific caveat language in the UI.

Lifecycle stage: 4 – Deployment

Mandatory source-of-record verification before AI-assisted output is committed (proposed)

For high-stakes outputs, require a human to verify each AI-asserted fact/citation against the authoritative source of record before it is filed, sent, or committed — a hard gate, logged and attributable, not an optional review.

source: Case study: mata-v-avianca

Lifecycle stage: 5 – Usage, Monitoring & Change
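
One way to picture the hard gate described in this control: every AI-asserted fact carries its own verification record, and the commit step refuses to run while any record is empty. This is a sketch under stated assumptions; the `Assertion` type, the function names, and the log format are hypothetical, and a real system would persist the records rather than hold them in memory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Assertion:
    text: str                              # the AI-asserted fact or citation
    verified_by: Optional[str] = None      # who checked the source of record
    verified_at: Optional[datetime] = None


class UnverifiedAssertionError(Exception):
    """Raised when output is committed with unverified assertions."""


def record_verification(assertion: Assertion, reviewer: str) -> None:
    """Mark an assertion as checked against the source of record."""
    if not reviewer:
        raise ValueError("verification must be attributable to a named reviewer")
    assertion.verified_by = reviewer
    assertion.verified_at = datetime.now(timezone.utc)


def commit_output(assertions: List[Assertion]) -> List[str]:
    """Hard gate: refuse to commit unless every assertion is verified.

    Returns an attributable audit log, one line per assertion.
    """
    unverified = [a.text for a in assertions if a.verified_by is None]
    if unverified:
        raise UnverifiedAssertionError(f"unverified assertions: {unverified}")
    return [
        f"{a.text} | verified by {a.verified_by} at {a.verified_at.isoformat()}"
        for a in assertions
    ]
```

Raising an exception rather than returning a warning is what makes the gate hard: the calling code cannot file, send, or commit by ignoring a return value.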

End-user AI-literacy training and verification-skill program (proposed)

Provide recurring AI-literacy training to end users and decision-makers so they can recognise model failure modes and competently apply verification workflows, with periodic refreshers to counter automation bias and training decay.

source: Interactive-control reconciliation: ctrl-literacy (partial coverage)

Lifecycle stage: 1 – Use Case Context & Design

Uncertainty signalling & abstention (interactive)

Teaching the AI to say 'I'm not sure' or 'I can't verify that' instead of confidently guessing.

Also addresses: Hallucination

Human-in-the-loop approval on high-risk actions (interactive)

Pausing to ask a person before doing anything big or hard to undo — sending money, deleting data, emailing customers.

Also addresses: Indirect Prompt Injection · Excessive Agency · Tool Misuse · Cascading Multi-Agent Errors · Agent Misalignment / Goal Misgeneralization · Resource Exhaustion / Denial of Wallet · Allocative Harm in Multi-User Arbitration · Synthetic-Media Impersonation (Deepfakes & Voice Clones)
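
A pause-before-acting gate for an agent's tool calls might look like the sketch below. The action names echo the examples in the description above; the `HIGH_RISK_ACTIONS` set, the exception type, and the `approved_by` parameter are assumptions for illustration, not a reference to any particular agent framework.

```python
# Hypothetical set of actions treated as big or hard to undo.
HIGH_RISK_ACTIONS = {"send_money", "delete_data", "email_customers"}


class ApprovalRequired(Exception):
    """Raised when a high-risk action is attempted without human approval."""


def execute_action(name, run, approved_by=None):
    """Run an action, pausing for a human on high-risk ones.

    name:        action identifier the agent wants to perform.
    run:         zero-argument callable that performs the action.
    approved_by: identifier of the human who approved it, if any.
    """
    if name in HIGH_RISK_ACTIONS and not approved_by:
        # Stop here; the caller must obtain approval and retry.
        raise ApprovalRequired(f"'{name}' needs human approval before it runs")
    return run()
```

Low-risk actions pass straight through, so the human is only interrupted where an error would be costly, which helps keep approvals from turning into rubber stamps.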

Detective · 2

Robustness testing

Test for overconfidence patterns (high-confidence wrong answers, low refusal rate) in pre-deployment validation.

Lifecycle stages: 3 – Onboarding, Build & Review · 5 – Usage, Monitoring & Change

Also addresses: Hallucination · Model Drift & Silent Degradation

Synthetic evaluation datasets

Build a synthetic evaluation dataset of overconfidence-prone scenarios for ongoing regression testing.

Lifecycle stage: 3 – Onboarding, Build & Review

Also addresses: Hallucination · Model Drift & Silent Degradation

Corrective · 3

Reinforcement learning

Track accuracy of high-confidence predictions in production. Trigger recalibration when overconfidence rates trend upward.

Lifecycle stage: 5 – Usage, Monitoring & Change

Also addresses: Hallucination · Model Drift & Silent Degradation
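
The production tracking described in this control can be sketched as two small functions: one computes the error rate among high-confidence predictions for a period, the other checks whether that rate has risen for several periods in a row. The 0.9 confidence cut-off, the three-period window, and the "strictly increasing" trigger rule are all assumptions chosen for the example.

```python
def high_confidence_error_rate(records, threshold=0.9):
    """Share of high-confidence predictions that turned out wrong.

    records: list of (confidence: float, correct: bool) for one period.
    Returns None when the period has no high-confidence predictions.
    """
    high = [ok for conf, ok in records if conf >= threshold]
    if not high:
        return None
    return sum(1 for ok in high if not ok) / len(high)


def needs_recalibration(period_rates, window=3):
    """True when the error rate rose in each of the last `window` periods.

    period_rates: chronological error rates; None entries are skipped.
    """
    recent = [r for r in period_rates if r is not None][-window:]
    if len(recent) < window:
        return False
    return all(earlier < later for earlier, later in zip(recent, recent[1:]))
```

Ground-truth labels usually arrive late in production, so the records for a period would typically come from sampled human review rather than from every prediction.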

User AI-literacy & verification workflows (interactive)

Helping the people using AI understand its limits, so they check important answers instead of blindly trusting them.

Also addresses: Hallucination · Parasocial Attachment & Emotional Over-reliance

Governance: risk assessment, red-teaming & incident response (interactive)

The organisational habits around the AI: assessing risks before launch, actively trying to break it, and having a plan for when something goes wrong.

Also addresses: Oversight & Audit-Trail Tampering · Model Drift & Silent Degradation · Supply-Chain Compromise · Agent Misalignment / Goal Misgeneralization · Abliteration / Safety Removal · Model Backdoors / Sleeper Agents · Inference-Time & Serving-Layer Manipulation · Capability / Architecture Disclosure · Parasocial Attachment & Emotional Over-reliance · Bias Amplification & Sycophancy · Allocative Harm in Multi-User Arbitration · Synthetic-Media Impersonation (Deepfakes & Voice Clones) · Harmful / Non-Consensual Media Generation · Watermark & Provenance Evasion · Training-Data Rights & Provenance

Framework mappings

OWASP LLM Top 10

LLM09:2025 Misinformation

MITRE ATLAS

—

NIST AI RMF

GOVERN 4.1
MEASURE 2.8

Real-world cases

Published events that illustrate this risk.

Mata v. Avianca: fabricated case citations (2023)

Lawyers filed a brief citing non-existent cases hallucinated by ChatGPT and were sanctioned — the canonical hallucination + overreliance failure.

Replit AI agent deletes a production database (2025)

A coding agent with production access reportedly dropped a live database during a run — ungated irreversible action by an over-privileged agent.

Slopsquatting: package hallucinations by code-generating LLMs (2025)

A USENIX Security 2025 study found that code-generating LLMs routinely recommend non-existent packages (about 5.2% of suggestions from commercial models and 21.7% from open-source models), letting attackers pre-register the predictable fake names, a tactic dubbed 'slopsquatting'.

Google / Character.AI teen-suicide wrongful-death settlement (2026)

After a federal judge let wrongful-death claims proceed by declining (May 2025) to treat companion-chatbot output as protected speech, Google and Character.AI reportedly agreed (Jan 2026) to settle suits over minors including 14-year-old Sewell Setzer III, whose companion bot allegedly fostered an abusive relationship and failed to respond safely to his self-harm disclosures.

Raine v. OpenAI: first wrongful-death suit alleging ChatGPT acted as a 'suicide coach' (2025)

Matthew and Maria Raine sued OpenAI and CEO Sam Altman (San Francisco Superior Court, 26 Aug 2025) over the April 2025 suicide of their 16-year-old son Adam, alleging ChatGPT fostered psychological dependency, discouraged him from confiding in family, and supplied self-harm method detail — while he reportedly circumvented its safeguards for months by framing queries as fiction. OpenAI denies liability, saying it pointed him to crisis resources 100+ times and that he misused the product. (Allegations unproven; litigation ongoing.)

Practise this in an interactive scenario

🌀The Refund That Never Existed

A support chatbot invents a policy — and the company is held to it

🕵️Lies in the Loop

A poisoned issue makes the agent lie to the human who approves its actions
