AI RiskAtlas
#30

Insufficient model accuracy / soundness

Risk taxonomy

Definition

Model outputs are inaccurate, or fall short of the performance thresholds required for the model to be fit for purpose.

Controls & guardrails that address this · 12

Grouped by control function, with the AI lifecycle stage(s) to apply each and the other risks it addresses.

Preventive · 9
Fine-tuning

Fine-tune on domain-specific, high-quality data to improve model performance on target tasks. Validate accuracy after fine-tuning.

Lifecycle stage: 3 – Onboarding, Build & Review
Weight regularisation and normalisation

Apply regularisation (L1/L2, dropout, early stopping) to prevent overfitting and improve generalisation.

Lifecycle stage: 3 – Onboarding, Build & Review
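
Two of the techniques named above can be sketched in a few lines. This is a minimal illustration, assuming a training loop that reports one validation loss per epoch; the function names `early_stopping_epoch` and `l2_penalty` and the default `patience` value are hypothetical, not part of any specific framework.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the index of the epoch whose weights should be kept.

    Training stops once validation loss has failed to improve by more
    than `min_delta` for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # stop training; keep the best epoch seen so far
    return best_epoch


def l2_penalty(weights, lam):
    """L2 regularisation term added to the training loss: lam * sum(w^2)."""
    return lam * sum(w * w for w in weights)
```

In practice most training frameworks provide both mechanisms built in; the sketch only shows the logic they apply.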
Small model selection

Prefer smaller, purpose-built models where accuracy requirements are met, to reduce complexity and maintenance burden.

Lifecycle stage: 3 – Onboarding, Build & Review
Also addresses: Hallucination
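
One way to read this control is "pick the smallest candidate that still clears the accuracy requirement". The sketch below assumes each candidate has already been evaluated; the dictionary keys `name`, `params_millions` and `accuracy` and the example figures are hypothetical.

```python
def smallest_adequate_model(candidates, required_accuracy):
    """Return the smallest candidate meeting the accuracy requirement.

    Returns None when no candidate meets it, in which case a smaller
    model is not an option for this use case.
    """
    adequate = [c for c in candidates if c["accuracy"] >= required_accuracy]
    if not adequate:
        return None
    return min(adequate, key=lambda c: c["params_millions"])


# Illustrative figures only, not measurements of any real model.
candidates = [
    {"name": "large", "params_millions": 7000, "accuracy": 0.95},
    {"name": "medium", "params_millions": 1300, "accuracy": 0.93},
    {"name": "small", "params_millions": 125, "accuracy": 0.88},
]
```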
AI onboarding using domain data

Verify training data covers all material input segments for the target use case. Augment where coverage gaps are found.

Lifecycle stage: 3 – Onboarding, Build & Review
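
A coverage check of this kind can be approximated by counting training examples per input segment. The sketch assumes each training record carries a segment label; the function name `coverage_gaps` and the `min_count` default are hypothetical and would need to be set per use case.

```python
from collections import Counter


def coverage_gaps(train_segments, required_segments, min_count=50):
    """Return {segment: count} for required segments that are under-covered.

    `train_segments` holds one segment label per training example;
    `required_segments` lists the segments material to the use case.
    """
    counts = Counter(train_segments)
    return {
        seg: counts[seg]  # Counter returns 0 for segments never seen
        for seg in required_segments
        if counts[seg] < min_count
    }
```

Segments returned by the check are the candidates for data augmentation.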
Model calibration

Calibrate model outputs to align stated confidence with actual accuracy. Validate calibration on held-out data.

Lifecycle stage: 3 – Onboarding, Build & Review
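
One common way to measure the gap between stated confidence and actual accuracy is expected calibration error (ECE), computed on held-out data. The page does not prescribe a metric, so treat this as one possible choice; the equal-width binning and `n_bins` default are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: weighted mean of |average confidence - accuracy| over bins.

    `confidences` are predicted probabilities in [0, 1] for the chosen
    output; `correct` holds True/False for whether each output was right.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in last bin
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece
```

An ECE near zero means stated confidence tracks observed accuracy; a large value suggests recalibration is needed before confidence scores are used for routing or disclosure.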
Quantitative accuracy thresholds calibrated to impact

Define quantitative accuracy acceptance thresholds at the design stage, calibrated to business impact and regulatory requirements.

Lifecycle stage: 1 – Use Case Context & Design
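
Recording thresholds as data at design time makes them checkable later in the lifecycle. The sketch below is purely illustrative: the tier names and numeric values in `IMPACT_TIER_THRESHOLDS` are hypothetical placeholders, and real values have to come from the impact and regulatory assessment for the use case.

```python
# Hypothetical impact tiers and minimum accuracy per tier (placeholders).
IMPACT_TIER_THRESHOLDS = {"low": 0.80, "medium": 0.90, "high": 0.97}


def acceptance_threshold(impact_tier, regulatory_floor=0.0):
    """Return the accuracy a model must reach for the given impact tier.

    A regulatory floor, where one applies, overrides a lower tier value.
    """
    if impact_tier not in IMPACT_TIER_THRESHOLDS:
        raise ValueError(f"unknown impact tier: {impact_tier!r}")
    return max(IMPACT_TIER_THRESHOLDS[impact_tier], regulatory_floor)
```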
Input/output filtering

Configure output confidence thresholds at deployment so that low-confidence outputs are either suppressed or escalated to human review.
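
The suppress-or-escalate decision can be expressed as a small routing function. This is a sketch under stated assumptions: the two cut-off values are hypothetical, and it presumes the model exposes a confidence score in [0, 1] that has already been calibrated.

```python
from enum import Enum


class Action(Enum):
    RELEASE = "release"    # return the output to the user
    ESCALATE = "escalate"  # hold the output for human review
    SUPPRESS = "suppress"  # do not return the output at all


def route_output(confidence, suppress_below=0.40, escalate_below=0.75):
    """Decide what to do with an output based on its confidence score."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if confidence < suppress_below:
        return Action.SUPPRESS
    if confidence < escalate_below:
        return Action.ESCALATE
    return Action.RELEASE
```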

Human-in-the-loop validation

Route high-consequence or low-confidence outputs to human review in production. Track override rates and outcomes.

Lifecycle stage: 5 – Usage, Monitoring & Change
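
Tracking override rates needs little more than a record of what the model proposed and what the reviewer decided. The sketch below assumes decisions can be compared for equality; the `ReviewRecord` type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReviewRecord:
    model_output: str       # what the model proposed
    reviewer_decision: str  # what the human reviewer finally approved


def override_rate(records):
    """Fraction of reviewed outputs where the reviewer changed the result."""
    if not records:
        return 0.0
    overridden = sum(
        1 for r in records if r.reviewer_decision != r.model_output
    )
    return overridden / len(records)
```

A rising override rate is an early signal that production accuracy is drifting away from what validation showed.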
User disclosure of accuracy and confidence limits

Disclose known accuracy limitations and confidence levels to users at deployment. Update disclosures when the model changes.

Lifecycle stage: 4 – Deployment
Detective · 2
Robustness testing

Define accuracy acceptance criteria before validation. Conduct multi-metric validation against hold-out sets. Block deployment if criteria are not met.

Lifecycle stages: 3 – Onboarding, Build & Review; 4 – Deployment
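
A deployment gate along these lines computes several metrics on a hold-out set and compares each with criteria fixed beforehand. The sketch assumes a binary classification task with labels 0 and 1; the metric selection and the function names are illustrative choices rather than a prescribed set.

```python
def validation_metrics(y_true, y_pred):
    """Accuracy, precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def failed_criteria(metrics, criteria):
    """Names of metrics below their pre-agreed minimum, in sorted order."""
    return sorted(
        name for name, minimum in criteria.items()
        if metrics.get(name, 0.0) < minimum
    )


def may_deploy(metrics, criteria):
    """Deployment is blocked if any acceptance criterion is not met."""
    return not failed_criteria(metrics, criteria)
```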
Synthetic evaluation datasets

Construct synthetic edge-case evaluation datasets to stress-test model boundaries and identify accuracy failure modes.

Lifecycle stage: 3 – Onboarding, Build & Review
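
For models with numeric inputs, one simple source of synthetic edge cases is the boundary of each feature's valid range. The sketch below is only one of many ways to build such a dataset; the function name `edge_case_grid`, the `epsilon` offset and the example ranges are hypothetical.

```python
from itertools import product


def edge_case_grid(feature_ranges, epsilon=1e-6):
    """Build inputs at, and just outside, each feature's valid range.

    `feature_ranges` maps feature name -> (minimum, maximum). Every
    combination of boundary values across features is returned.
    """
    names = list(feature_ranges)
    per_feature = []
    for name in names:
        low, high = feature_ranges[name]
        per_feature.append([low - epsilon, low, high, high + epsilon])
    return [dict(zip(names, combo)) for combo in product(*per_feature)]


# Illustrative ranges only.
cases = edge_case_grid({"age": (18, 100), "income": (0.0, 1_000_000.0)})
```

The grid grows as 4 to the power of the number of features, so for wide inputs a sampled subset is more practical than the full product.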
Corrective · 1
Reinforcement learning

Establish a periodic revalidation and improvement cycle using RLHF or user feedback. Retrain when accuracy trends below the acceptance threshold.

Lifecycle stage: 5 – Usage, Monitoring & Change
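
The retraining trigger can be sketched as a rolling-window accuracy monitor fed by labelled production outcomes or user feedback. The class name `AccuracyMonitor`, the window size and the rule of waiting for a full window are assumptions made for illustration.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks accuracy over the most recent `window` labelled outcomes."""

    def __init__(self, threshold, window=100):
        self.threshold = threshold
        self.outcomes = deque(maxlen=window)  # oldest outcomes drop off

    def record(self, correct):
        """Record whether one production output was judged correct."""
        self.outcomes.append(bool(correct))

    def rolling_accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True once a full window of outcomes falls below the threshold."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # not enough evidence yet
        return self.rolling_accuracy() < self.threshold
```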


AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning, not operational guidance. Real-world cases are summarised from public reporting.

Built by Shi Yuan