AI RiskAtlas
Case study

Air Canada chatbot refund-policy ruling

Real-world incident · 14 Feb 2024 · Conversational Assistant

A tribunal held Air Canada liable after its website chatbot invented a bereavement-fare refund policy; the airline had to honour it.

Root cause: why it happened

A grieving customer asked Air Canada's website chatbot about bereavement fares. The bot confidently told him he could book now and claim the discount back afterwards, but Air Canada had no such 'apply later' policy. He believed the bot, booked full-fare flights, and was later refused the refund. The bot made up a policy; nothing checked the answer against the airline's real rules before showing it; and the customer reasonably trusted what the company's own website told him.

Risks this case illustrates

Risks are named using the standard (OWASP/ATLAS/NIST) lens.

How it unfolded

[Interactive diagram. Zones: Your system, Untrusted. Components: User (asks), Chat / App Interface, Input Guardrail, Prompt Assembly, LLM, Output Guardrail, Real bereavement, Customer books full-fare, B.C. Civil Resolution. Flow types: Instructions, Data, Actions, Control / decision, Feedback / logs.]
Setup (Step 1 / 6)

A grieving customer asks about bereavement fares

After a death in the family, a customer goes to Air Canada's website and opens the support chatbot to ask how bereavement fares work and whether he can get the discount. It is a high-stakes, time-pressured moment, exactly when people lean on whatever the official website tells them.

The customer's question (reconstructed prompt):
My grandmother just passed away and I need to fly to the funeral. How do Air Canada's bereavement fares work? Can I get the bereavement rate?

Controls & guardrails: what would have stopped it

The simplest fix: make the chatbot answer policy questions only from the airline's real policy pages, and say 'I'm not sure, here's the official page' when it isn't certain, instead of confidently inventing rules. And the company has to own what its bot says: treat the chatbot's answers as the company's own statements, because a tribunal already ruled they are.
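A grounded-or-abstain gate of this kind can be sketched in a few lines of Python. This is a minimal illustration, not how Air Canada's system works: the policy text, the URLs, the `grounded_or_abstain` function, and the 0.8 threshold are all assumptions invented for the example, and the word-overlap score is a crude stand-in for a real retrieval and entailment check.

```python
import re
from dataclasses import dataclass

# ASSUMPTION: placeholder policy text and URLs, invented for illustration.
# A real system would load the airline's published policy pages.
POLICY_PAGES = {
    "https://example.com/policies/bereavement": (
        "Bereavement fares must be requested before travel. "
        "Refund requests for bereavement fares are not accepted after travel."
    ),
}
POLICY_INDEX_URL = "https://example.com/policies"
ABSTAIN_TEXT = "I'm not sure. Please see the official policy page or ask a human agent."


@dataclass
class Reply:
    text: str
    source_url: str
    grounded: bool


def tokens(text: str) -> set:
    """Lower-cased alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def support_score(claim: str, passage: str) -> float:
    """Fraction of the claim's words that also appear in the passage."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & tokens(passage)) / len(claim_tokens)


def grounded_or_abstain(draft_answer: str, threshold: float = 0.8) -> Reply:
    """Show the model's draft only if some policy page supports it; else abstain."""
    best_url, best_score = POLICY_INDEX_URL, 0.0
    for url, passage in POLICY_PAGES.items():
        score = support_score(draft_answer, passage)
        if score > best_score:
            best_url, best_score = url, score
    if best_score >= threshold:
        return Reply(draft_answer, best_url, grounded=True)
    # Unsupported draft: replace it with an abstention and a link to the source of record.
    return Reply(ABSTAIN_TEXT, best_url, grounded=False)
```

With this placeholder policy, the invented draft "You can book now and claim the discount back afterwards." shares no words with the policy page and is replaced by the abstention, while a draft that restates the policy passes. Word overlap cannot detect a negated or reworded claim, so a production check would need a stronger comparison, and any such check inherits the limits listed under the controls below.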

Preventive
  • Grounding / citation checks
    Addresses: Hallucination

    Can only check against the evidence retrieved; if the right document wasn't retrieved, a confident wrong answer may still pass. Judges have their own error rate.

  • Uncertainty signalling & abstention
    Addresses: Hallucination

    Models are poorly calibrated and often confidently wrong; over-abstention makes the product useless, so the tuning is delicate.

Detective
  • Behavioural evals & regression gating
    Addresses: Hallucination

    Evals only measure what they test; novel behaviours and rare triggers slip through, and a backdoor keyed to an unguessed trigger passes every benchmark.
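
A regression gate for this incident could look like the sketch below. It is a hypothetical example: the eval cases, the `run_evals` and `release_gate` names, and the stub bots are assumptions made for illustration. It checks only that replies to known risky questions avoid known invented-policy phrases, which is exactly the "only measures what it tests" limitation described above.

```python
from typing import Callable, Dict, List

# ASSUMPTION: hand-written eval cases invented for illustration. Each case pairs a
# customer question with phrases that would signal an invented 'apply later' policy.
EVAL_CASES: List[Dict] = [
    {
        "question": "Can I claim the bereavement discount after I have flown?",
        "forbidden": ["claim the discount back afterwards", "apply later"],
    },
    {
        "question": "How do bereavement fares work?",
        "forbidden": ["claim the discount back afterwards"],
    },
]


def run_evals(bot: Callable[[str], str], cases: List[Dict]) -> List[Dict]:
    """Return one failure record per case whose reply contains a forbidden phrase."""
    failures = []
    for case in cases:
        reply = bot(case["question"]).lower()
        hits = [phrase for phrase in case["forbidden"] if phrase in reply]
        if hits:
            failures.append({"question": case["question"], "hits": hits})
    return failures


def release_gate(bot: Callable[[str], str], cases: List[Dict], max_failures: int = 0) -> bool:
    """Allow a release only if the number of failing cases is within budget."""
    return len(run_evals(bot, cases)) <= max_failures


# Stub bots standing in for two candidate chatbot versions.
def hallucinating_bot(question: str) -> str:
    return "Yes, you can book now and claim the discount back afterwards."


def abstaining_bot(question: str) -> str:
    return "I'm not sure. Please see the official policy page."
```

Here `release_gate(hallucinating_bot, EVAL_CASES)` blocks the release and `release_gate(abstaining_bot, EVAL_CASES)` allows it. A bot that invents a different policy in different words would still pass, so the case list has to grow with every incident and red-team finding.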

Corrective
  • Governance: risk assessment, red-teaming & incident response

    Process reduces likelihood and speeds recovery but executes no technical control itself; weak follow-through makes it theatre.

  • User AI-literacy & verification workflows
    Addresses: Hallucination

    Relies on human diligence under time pressure; automation bias is strong and training decays. A backstop, not a guarantee.

Lessons

  • An organisation owns what its AI tells customers: 'the chatbot is a separate entity' is not a defence (Moffatt v. Air Canada).
  • A confident, well-formatted answer is not a grounded one; without a check against the source of record, the model can invent policy outright.
  • Linking to the correct policy page is not grounding: users trust the natural-language answer over the citation, so the answer itself must be grounded or abstain.
  • For high-stakes, customer-facing factual questions (refunds, eligibility), default to grounded-or-abstain and provide a human-escalation path.
  • Hallucination becomes liability the moment a customer reasonably relies on the output and acts; treat chatbot answers as first-party representations.
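
The default described in the lessons above (grounded-or-abstain for high-stakes questions, with a human-escalation path) can be sketched as a small routing step. Everything here is an assumption for illustration: the keyword list, the `Mode` names, and the `route` and `needs_human` functions are hypothetical, and a keyword match is only a rough proxy for identifying a high-stakes question.

```python
import re
from enum import Enum


class Mode(Enum):
    GROUNDED_OR_ABSTAIN = "grounded_or_abstain"  # answer only from the source of record
    FREE_FORM = "free_form"                      # ordinary conversational reply


# ASSUMPTION: illustrative keywords for high-stakes topics (refunds, eligibility).
HIGH_STAKES_TERMS = {"refund", "refunds", "bereavement", "eligible", "eligibility", "fare", "fares"}


def route(question: str) -> Mode:
    """Send any question that mentions a high-stakes term down the strict path."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    if words & HIGH_STAKES_TERMS:
        return Mode.GROUNDED_OR_ABSTAIN
    return Mode.FREE_FORM


def needs_human(mode: Mode, grounded: bool) -> bool:
    """Escalate when a high-stakes question could not be answered from policy."""
    return mode is Mode.GROUNDED_OR_ABSTAIN and not grounded
```

For example, `route("How do bereavement fares work?")` selects the strict path, and if no grounded answer is available, `needs_human` signals that the conversation should be handed to a human agent instead of ending with an abstention.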

AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning, not operational guidance. Real-world cases are summarised from public reporting.

Built by Shi Yuan