🔍AI RiskAtlas

Model Drift & Silent Degradation

medium · Model behaviour

Definition

The AI's behaviour quietly changes over time, either because a vendor updates the model or because the world moves on from its training data, and things that used to work start failing.

Where it attaches

The system components where this risk arises.

🧠 LLM · 🧬 Model Weights & Registry · 🏗️ Serving Infrastructure · ✂️ Tokenizer · 📈 Monitoring & Evals · 📉 Quantizer / Compressor

Detection signals

  • Eval scores drop after a vendor model update
  • Format/parse failures rising in tool calls
  • Behaviour change with no code change on your side

Controls & guardrails that address this · 15

Controls are grouped by function. Each lists the AI lifecycle stage(s) where it applies and the other risks it addresses.

Preventive · 7
Risk-tiered minimum monitoring requirements at design

Define minimum monitoring requirements at design stage calibrated to the use case risk tier.

Lifecycle stage: 1 – Use Case Context & Design
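
One way to make such requirements concrete is a small lookup table keyed by risk tier. The sketch below is hypothetical throughout: the tier names, field names, and numbers are invented for illustration and would need to come from your own governance process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoringRequirement:
    eval_cadence_days: int      # how often the baseline eval set is re-run
    max_score_drop: float       # tolerated drop versus baseline before alerting
    human_review_sample: float  # share of production outputs sampled for review


# Hypothetical tiers and values, for illustration only.
MIN_REQUIREMENTS = {
    "low": MonitoringRequirement(30, 0.10, 0.00),
    "medium": MonitoringRequirement(7, 0.05, 0.01),
    "high": MonitoringRequirement(1, 0.02, 0.05),
}


def requirements_for(tier: str) -> MonitoringRequirement:
    """Look up the minimum monitoring requirement for a risk tier."""
    try:
        return MIN_REQUIREMENTS[tier]
    except KeyError:
        raise ValueError(f"unknown risk tier: {tier!r}") from None
```

Failing loudly on an unknown tier is a deliberate choice here: a use case with no assigned tier should not silently inherit the weakest monitoring.
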
Programmable conversation controls

Configure monitoring hooks in the conversation layer at deployment to capture the metrics required by the stage 1 (S1) monitoring requirements.

Lifecycle stages: 3 – Onboarding, Build & Review · 4 – Deployment
Also addresses: Hallucination
Fine-tuning

Execute a controlled fine-tuning cycle on refreshed data when staleness is confirmed. Validate before promoting to production.

Lifecycle stage: 5 – Usage, Monitoring & Change
Also addresses: Hallucination
Approved use scope baseline for OOD controls

Define approved use case scope and expected input distribution at design stage. Document as the governance baseline for OOD controls.

Lifecycle stage: 1 – Use Case Context & Design
Modular architecture

Design a scope-enforcement layer in the architecture to isolate the AI system from off-topic or out-of-distribution inputs.

Lifecycle stage: 1 – Use Case Context & Design
Input filtering

Maintain and update OOD detection rules in production as new unexpected use patterns are identified.

Lifecycle stage: 5 – Usage, Monitoring & Change
Weight provenance, hashing & pre-deploy evals

Knowing exactly where the model came from, checking it hasn't been swapped, and testing its behaviour before going live.
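
The "hasn't been swapped" check can be sketched as a SHA-256 comparison between a weights file and a digest recorded when the model was registered. The function names and the idea of storing the expected digest in a registry are assumptions for this example; only the `hashlib` and `pathlib` calls are standard library.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path: Path, expected_sha256: str) -> bool:
    """Return True if the file's digest matches the recorded one (case-insensitive)."""
    return sha256_of_file(path) == expected_sha256.strip().lower()
```

A matching hash only shows the file is unchanged. It says nothing about behaviour, which is why the control pairs it with pre-deploy evals.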

Detective · 4
Synthetic evaluation datasets

Construct synthetic evaluation datasets during build to serve as the ongoing monitoring baseline.

Lifecycle stage: 3 – Onboarding, Build & Review
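
To show how a fixed evaluation set can act as a baseline, the sketch below compares per-example scores from the build-time run with scores from a later run on the same examples. The function names and the 0.05 tolerance are assumptions for illustration. Comparing means is only one simple option and ignores sampling noise.

```python
from statistics import mean


def score_drop(baseline: list[float], current: list[float]) -> float:
    """Mean baseline score minus mean current score (positive means degradation)."""
    if not baseline or not current:
        raise ValueError("both score lists must be non-empty")
    return mean(baseline) - mean(current)


def has_regressed(baseline: list[float], current: list[float],
                  max_drop: float = 0.05) -> bool:
    """Flag a regression when the mean score falls by more than max_drop."""
    return score_drop(baseline, current) > max_drop
```
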
Robustness testing

Stand up monitoring infrastructure during the build stage: performance metrics collection, alerting thresholds, and dashboards.

Lifecycle stages: 3 – Onboarding, Build & Review · 4 – Deployment · 5 – Usage, Monitoring & Change
Corrective · 5
Reinforcement learning

Implement a reinforcement learning feedback loop to continuously incorporate production signals and reduce staleness risk.

Lifecycle stage: 5 – Usage, Monitoring & Change
Input filtering

Implement OOD detection in the input filtering layer. Reject or escalate inputs outside the S1-defined scope.

Lifecycle stage: 3 – Onboarding, Build & Review
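
A deliberately simple stand-in for OOD detection is sketched below: it measures how much of an input's vocabulary overlaps with a list of approved-scope terms and escalates inputs that fall below a threshold. The scope terms, the threshold, and the function names are all hypothetical. Production systems more commonly use embedding distance or a trained classifier, which this sketch does not attempt.

```python
import re

# Hypothetical approved-scope vocabulary, e.g. for a billing support assistant.
SCOPE_TERMS = {"invoice", "refund", "payment", "billing", "subscription", "charge"}


def scope_overlap(text: str, scope_terms: set[str] = SCOPE_TERMS) -> float:
    """Share of distinct words in the input that appear in the scope vocabulary."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    if not words:
        return 0.0
    return len(words & scope_terms) / len(words)


def route_input(text: str, min_overlap: float = 0.1) -> str:
    """Return 'accept' for in-scope inputs, otherwise 'escalate'."""
    return "accept" if scope_overlap(text) >= min_overlap else "escalate"
```
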
Red teaming

Conduct adversarial red team exercises simulating out-of-scope inputs and unexpected use patterns before deployment.

Human-in-the-loop validation

Configure HITL triggers for outputs in input domains that diverge from the training distribution. Log all out-of-scope interventions.

Lifecycle stage: 5 – Usage, Monitoring & Change
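
A HITL trigger of this kind can be sketched as a threshold on an out-of-distribution score, with every intervention written to a log. The `ood_score` input, the 0.8 threshold, and the `Decision` type are assumptions for this example. How the score is computed is left to whatever OOD detector is in place.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("hitl")


@dataclass
class Decision:
    needs_human: bool
    reason: str


def hitl_decision(ood_score: float, threshold: float = 0.8) -> Decision:
    """Route to a human reviewer when the OOD score reaches the threshold."""
    if ood_score >= threshold:
        # Log every out-of-scope intervention so it can be audited later.
        logger.warning("out-of-scope intervention: ood_score=%.2f", ood_score)
        return Decision(True, "input diverges from expected distribution")
    return Decision(False, "within expected distribution")
```
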

Framework mappings

OWASP LLM Top 10
MITRE ATLAS
NIST AI RMF
  • MEASURE 2.4
  • MANAGE 4.1

AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning — not operational guidance. Real-world cases are summarised from public reporting.

Built by Shi Yuan