AI RiskAtlas
#1

Unrepresentative or biased data inputs

Risk taxonomy

Definition

Data is biased against, or unevenly represents, certain individuals or groups of individuals, which can produce biased model outputs.

Controls & guardrails that address this

16

Controls are grouped by control function, each listed with the AI lifecycle stage(s) at which to apply it and any other risks it addresses.

Control category
Preventive · 10
Fairness impact assessment at use-case intake

Conduct a fairness impact assessment at use-case intake. Require governance sign-off on demographic coverage requirements before data acquisition.

Lifecycle stage: 1 – Use Case Context & Design
Algorithm re-selection

Select the modelling algorithm based on its bias risk profile. Prefer algorithms with lower sensitivity to demographic distribution shifts.

Lifecycle stages: 1 – Use Case Context & Design; 2 – Data Acquisition & Processing
Model separation

Design separate model modules for distinct demographic populations where data characteristics diverge materially.

Lifecycle stage: 1 – Use Case Context & Design
In-processing techniques

Apply adversarial debiasing or fairness constraints during model training. Validate against fairness metrics before sign-off.

Lifecycle stage: 3 – Onboarding, Build & Review
Hyperparameter tuning

Tune hyperparameters with fairness-aware search objectives. Reject configurations whose demographic disparity exceeds the agreed threshold.

Lifecycle stage: 3 – Onboarding, Build & Review
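A minimal sketch of the rejection rule in the hyperparameter tuning control, assuming binary predictions, a single group label per record, and demographic parity (the gap in positive-prediction rate between groups) as the disparity measure. The names `demographic_parity_gap`, `select_config`, and `max_gap` are hypothetical, and the source does not specify which metric or threshold to use.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# (config name, validation accuracy, predictions, group label per prediction)
Candidate = Tuple[str, float, List[int], List[str]]


def demographic_parity_gap(preds: List[int], groups: List[str]) -> float:
    """Largest difference in positive-prediction rate between any two groups."""
    totals: Dict[str, int] = {}
    positives: Dict[str, int] = {}
    for pred, group in zip(preds, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + pred
    rates = [positives[g] / totals[g] for g in totals]
    return max(rates) - min(rates)


def select_config(
    candidates: Iterable[Candidate], max_gap: float
) -> Optional[Tuple[str, float, float]]:
    """Return (name, accuracy, gap) of the most accurate config within max_gap."""
    best = None
    for name, accuracy, preds, groups in candidates:
        gap = demographic_parity_gap(preds, groups)
        if gap > max_gap:
            continue  # assumed policy: disparity above the threshold is disqualifying
        if best is None or accuracy > best[1]:
            best = (name, accuracy, gap)
    return best
```

Treating the disparity limit as a hard filter rather than a weighted term in the objective means a more accurate but less fair configuration can never win. `select_config` returns `None` when every candidate fails the limit.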
Model customisation

Fine-tune on a curated, representative dataset verified for demographic balance. Document coverage breakdown before training.

Lifecycle stage: 3 – Onboarding, Build & Review
Decision threshold adjustment

Calibrate decision thresholds per demographic group to equalise error rates. Validate calibration before deployment sign-off.

Lifecycle stage: 3 – Onboarding, Build & Review
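One way to read the decision threshold adjustment control is sketched below. It assumes the error rate being equalised is the true positive rate (sometimes called equal opportunity) and that every group has at least one positive example in the validation set. The function names and the `target_tpr` parameter are illustrative and do not come from the source.

```python
from typing import Dict, List, Tuple


def tpr_at(threshold: float, scores: List[float], labels: List[int]) -> float:
    """True positive rate when predicting positive for score >= threshold."""
    positive_scores = [s for s, y in zip(scores, labels) if y == 1]
    if not positive_scores:
        raise ValueError("group has no positive examples")
    return sum(s >= threshold for s in positive_scores) / len(positive_scores)


def per_group_thresholds(
    scores: List[float], labels: List[int], groups: List[str], target_tpr: float
) -> Dict[str, float]:
    """Pick, per group, the strictest threshold that still reaches target_tpr."""
    by_group: Dict[str, Tuple[List[float], List[int]]] = {}
    for s, y, g in zip(scores, labels, groups):
        g_scores, g_labels = by_group.setdefault(g, ([], []))
        g_scores.append(s)
        g_labels.append(y)

    thresholds: Dict[str, float] = {}
    for g, (g_scores, g_labels) in by_group.items():
        # Candidate thresholds are the observed scores, tried from highest to lowest.
        for t in sorted(set(g_scores), reverse=True):
            if tpr_at(t, g_scores, g_labels) >= target_tpr:
                thresholds[g] = t
                break
    return thresholds
```

Because each group's threshold is chosen on validation data, the resulting rates should be re-checked on held-out data before sign-off, as the control requires.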
Post-processing techniques

Apply post-processing adjustments (re-ranking, score recalibration) to correct fairness gaps identified in validation.

Lifecycle stage: 3 – Onboarding, Build & Review
System prompt instructions

Design system prompts to include explicit fairness requirements: instruct the model to avoid stereotyping and demographic assumptions.

Lifecycle stage: 3 – Onboarding, Build & Review
User disclosure of training data bias

Disclose to all users at deployment that model outputs may reflect training data biases. Include a specific limitation caveat.

Lifecycle stage: 4 – Deployment
Detective · 1
Model evaluation

Conduct comprehensive fairness validation across demographic groups before deployment. Treat material disparity as a blocking defect.

Lifecycle stage: 3 – Onboarding, Build & Review
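To illustrate what "blocking defect" could mean in practice, the sketch below raises an exception when per-group accuracy diverges by more than a tolerance, so an automated release pipeline would stop. Per-group accuracy is only one possible fairness metric, and `FairnessGateError`, `fairness_gate`, and `max_gap` are hypothetical names chosen for this example.

```python
from typing import Dict, List


class FairnessGateError(Exception):
    """Raised when the disparity between groups exceeds the allowed gap."""


def per_group_accuracy(
    preds: List[int], labels: List[int], groups: List[str]
) -> Dict[str, float]:
    """Fraction of correct predictions within each demographic group."""
    correct: Dict[str, int] = {}
    total: Dict[str, int] = {}
    for p, y, g in zip(preds, labels, groups):
        total[g] = total.get(g, 0) + 1
        correct[g] = correct.get(g, 0) + int(p == y)
    return {g: correct[g] / total[g] for g in total}


def fairness_gate(
    preds: List[int], labels: List[int], groups: List[str], max_gap: float
) -> Dict[str, float]:
    """Return per-group accuracy, or raise if the best-worst gap exceeds max_gap."""
    accuracy = per_group_accuracy(preds, labels, groups)
    gap = max(accuracy.values()) - min(accuracy.values())
    if gap > max_gap:
        raise FairnessGateError(f"accuracy gap {gap:.2f} exceeds limit {max_gap:.2f}")
    return accuracy
```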
Corrective · 5
Input/output filtering

Screen training data for demographic gaps using automated pipeline checks. Reject batches failing representation thresholds.
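The batch screening described above might look like the following sketch. It assumes each record is a dictionary carrying a group attribute and that representation thresholds are expressed as a minimum share per group. The names `batch_passes`, `group_key`, and `min_share` are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List


def representation_shares(records: List[dict], group_key: str) -> Dict[str, float]:
    """Share of the batch belonging to each group."""
    counts = Counter(record[group_key] for record in records)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def batch_passes(
    records: List[dict], group_key: str, min_share: Dict[str, float]
) -> bool:
    """True only if every listed group meets its minimum share of the batch."""
    if not records:
        return False  # an empty batch cannot demonstrate coverage
    shares = representation_shares(records, group_key)
    # A group missing from the batch has share 0.0 and fails any positive floor.
    return all(shares.get(group, 0.0) >= floor for group, floor in min_share.items())
```

A pipeline step could call `batch_passes` on each incoming batch and route failures to a quarantine location instead of the training set.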

Pre-deployment adversarial bias testing by demographic

Execute adversarial bias testing using targeted demographic test cases before deployment.

Lifecycle stage: 3 – Onboarding, Build & Review
Human-in-the-loop validation

Conduct structured human expert review of model outputs stratified across demographic groups before deployment.

Lifecycle stage: 3 – Onboarding, Build & Review
Model monitoring

Continuously monitor fairness metrics across demographic groups in production. Trigger model review when bias drift is detected.

Lifecycle stage: 5 – Usage, Monitoring & Change
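A rough sketch of a bias-drift trigger for the model monitoring control follows. It assumes a fairness gap (for example, the difference in positive-prediction rate between groups) is already computed once per monitoring window, and that "drift" means the gap staying above the validation baseline plus a tolerance for several windows in a row. The function name, `tolerance`, and `consecutive` are hypothetical, since the source does not define how drift is detected.

```python
from typing import Iterable


def needs_review(
    baseline_gap: float,
    window_gaps: Iterable[float],
    tolerance: float,
    consecutive: int = 2,
) -> bool:
    """True if the gap exceeds baseline + tolerance for `consecutive` windows in a row."""
    limit = baseline_gap + tolerance
    streak = 0
    for gap in window_gaps:
        # A single window back under the limit resets the streak.
        streak = streak + 1 if gap > limit else 0
        if streak >= consecutive:
            return True
    return False
```

Requiring consecutive breaches is a design choice that trades detection speed for fewer false alarms from noisy, low-volume windows.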
User feedback and iterative improvement

Monitor fairness metric trends by demographic group in production. Use feedback to drive targeted debiasing in model updates.

Lifecycle stage: 5 – Usage, Monitoring & Change
Also addresses: Jailbreak

Other risks in Fairness & Bias

AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning, not operational guidance. Real-world cases are summarised from public reporting.

Built by Shi Yuan