๐Ÿ”AI RiskAtlas
โ† Risk Taxonomy
#36

Data poisoning

Risk taxonomy

Definition

Deliberate manipulation of the model by a malicious actor, through the introduction of malicious data at initial training or during use. This can lead to security vulnerabilities or inaccurate and harmful outputs.

Interactive deep-dive

This risk surfaces under more than one interactive treatment, each with its own technical detail, attack surface, detection signals, and scenarios.

Controls & guardrails that address this

10 · 1 proposed

Grouped by control function, with the AI lifecycle stage(s) to apply each and the other risks it addresses.

Control category
Preventive · 3
Role-based access controls

Design strict RBAC for training data repositories at the design stage. Define an approved contributor list and an approval workflow for changes.

Lifecycle stages: 1 – Use Case Context & Design; 2 – Data Acquisition & Processing; 4 – Deployment
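As an illustration of the contributor list and approval workflow, here is a minimal in-memory sketch. The role names, the `DatasetRepo` class, and the rule that a proposer cannot approve their own change are assumptions made for this example, not part of the control text.

```python
from dataclasses import dataclass, field

# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS = {
    "contributor": {"propose"},
    "reviewer": {"propose", "approve"},
}


@dataclass
class DatasetRepo:
    roles: dict                                   # user -> role (the approved contributor list)
    pending: dict = field(default_factory=dict)   # change_id -> proposer
    merged: list = field(default_factory=list)    # change ids accepted into the dataset

    def _require(self, user, permission):
        # Users missing from the approved list have no role and are always rejected.
        role = self.roles.get(user)
        if role is None or permission not in ROLE_PERMISSIONS[role]:
            raise PermissionError(f"{user!r} may not {permission}")

    def propose(self, user, change_id):
        self._require(user, "propose")
        self.pending[change_id] = user

    def approve(self, user, change_id):
        self._require(user, "approve")
        proposer = self.pending[change_id]        # KeyError if nothing was proposed
        if proposer == user:
            # Assumed rule: a second person must review every change.
            raise PermissionError("proposer cannot approve their own change")
        del self.pending[change_id]
        self.merged.append(change_id)
```

In practice the same rules would normally be enforced by the storage platform's own access controls rather than application code.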
Input filtering

Apply anomaly detection on the training data ingestion pipeline to identify poisoned or tampered batches.

Lifecycle stage: 2 – Data Acquisition & Processing
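One simple way to screen ingestion batches is a robust outlier test on a per-batch summary statistic. The sketch below uses the modified z-score (median and median absolute deviation). The choice of statistic (for example, the share of positive labels in a batch) and the 3.5 threshold are assumptions, and a production pipeline would likely combine several signals.

```python
from statistics import median


def flag_anomalous_batches(batch_stats, threshold=3.5):
    """Return indices of batches whose summary statistic is a robust outlier.

    batch_stats holds one number per batch. Median and MAD are used instead of
    mean and standard deviation so that a few poisoned batches cannot easily
    shift the reference point.
    """
    med = median(batch_stats)
    mad = median([abs(x - med) for x in batch_stats])
    if mad == 0:
        # Degenerate case: at least half the batches are identical,
        # so any batch that differs from the median is flagged.
        return [i for i, x in enumerate(batch_stats) if x != med]
    return [
        i for i, x in enumerate(batch_stats)
        if 0.6745 * abs(x - med) / mad > threshold
    ]


# Example: the sixth batch has an unusually high positive-label share.
suspicious = flag_anomalous_batches([0.50, 0.52, 0.49, 0.51, 0.48, 0.95, 0.50])
```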
RAG / knowledge-base ingestion allow-listing with continuous index integrity re-validation

Define and approve the source allow-list and write-time scanning during build. Before go-live, prove that writes from non-allow-listed sources and writes carrying injection content are rejected.

source: OWASP Top 10 for LLM Apps LLM04:2025 Data and Model Poisoning, LLM08:2025 Vector and Embedding Weaknesses; NIST SP 800-53 AC-3 / SI-7
Lifecycle stages: 3 – Onboarding, Build & Review; 5 – Usage, Monitoring & Change
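The write-time check and the later integrity re-validation can be sketched as follows. The host names, the two regular expressions, and the index layout are hypothetical placeholders. Real injection scanning needs far broader coverage than a pair of patterns.

```python
import hashlib
import re
from urllib.parse import urlparse

# Hypothetical allow-list and patterns, for illustration only.
ALLOWED_HOSTS = {"docs.example.com", "wiki.example.com"}
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"disregard (the )?system prompt", re.IGNORECASE),
]


def admit_document(source_url, text):
    """Decide at write time whether a document may enter the index."""
    host = urlparse(source_url).hostname
    if host not in ALLOWED_HOSTS:
        return False, "source not on allow-list"
    for pattern in INJECTION_PATTERNS:
        if pattern.search(text):
            return False, "injection pattern matched"
    return True, "ok"


def digest(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def revalidate_index(index):
    """index maps doc_id -> (text, digest recorded at write time).

    Returns the ids whose stored text no longer matches its recorded digest.
    """
    return [doc_id for doc_id, (text, recorded) in index.items()
            if digest(text) != recorded]
```

Running `revalidate_index` on a schedule is one way to approximate the "continuous" part of the control. The recorded digests should live somewhere the index writer cannot modify.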
Detective · 4
Vulnerability assessment

Conduct a data poisoning threat assessment at design stage. Identify likely attack vectors and assign risk ratings.

Lifecycle stages: 1 – Use Case Context & Design; 5 – Usage, Monitoring & Change
Red teaming

Simulate data poisoning attacks (backdoor, label flipping, gradient-based) to assess model resilience before deployment.
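Of the attacks named above, label flipping is the simplest to simulate. The sketch below flips a chosen fraction of binary labels and compares a toy nearest-centroid classifier trained on clean versus poisoned labels. The classifier and the one-dimensional data are stand-ins chosen to keep the example self-contained, and backdoor and gradient-based attacks are not covered.

```python
import random


def flip_labels(labels, fraction, seed=0):
    """Return a copy of binary labels with the given fraction flipped."""
    rng = random.Random(seed)
    n_flip = round(fraction * len(labels))
    poisoned = list(labels)
    for i in rng.sample(range(len(labels)), n_flip):
        poisoned[i] = 1 - poisoned[i]
    return poisoned


def fit_centroids(xs, ys):
    """Toy model: one centroid per class (assumes both classes are present)."""
    return {c: sum(x for x, y in zip(xs, ys) if y == c) / ys.count(c)
            for c in (0, 1)}


def accuracy(centroids, xs, ys):
    preds = [min(centroids, key=lambda c: abs(x - centroids[c])) for x in xs]
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)


xs = [0, 1, 2, 3, 4, 6, 7, 8, 9, 10]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

clean_acc = accuracy(fit_centroids(xs, ys), xs, ys)
poisoned_acc = accuracy(fit_centroids(xs, flip_labels(ys, 0.4)), xs, ys)
resilience_gap = clean_acc - poisoned_acc   # larger gap means less resilient
```

Repeating the measurement across several flip fractions and seeds gives a rough resilience curve that can be tracked between model versions.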

Cryptographic data provenance and signed dataset lineage (C2PA/in-toto attestations)

Verify a signed attestation and content hash on every dataset shard at ingestion. Reject unsigned or hash-mismatched data before it reaches the training pipeline.

source: MITRE ATLAS AML.M0007 (Sanitize Training Data), AML.M0014 (Verify ML Artifacts); NIST SP 800-53 SI-7 Software, Firmware, and Information Integrity, SR-4 Provenance
Lifecycle stages: 2 – Data Acquisition & Processing; 3 – Onboarding, Build & Review
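The ingestion check can be sketched as below. This example signs with an HMAC and a shared key purely to stay within the Python standard library. That is an assumption of the sketch: real C2PA or in-toto attestations use asymmetric signatures and their own envelope formats, which are not reproduced here. The attestation layout is likewise hypothetical.

```python
import hashlib
import hmac
import json


def make_attestation(shard, name, key):
    """Producer side: bind the shard's content hash to its name and sign it."""
    statement = {"name": name, "sha256": hashlib.sha256(shard).hexdigest()}
    payload = json.dumps(statement, sort_keys=True).encode("utf-8")
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"statement": statement, "sig": signature}


def verify_shard(shard, attestation, key):
    """Ingestion side: accept only signed shards whose hash matches."""
    if not attestation or "statement" not in attestation or "sig" not in attestation:
        return False                                  # unsigned data is rejected
    payload = json.dumps(attestation["statement"], sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, attestation["sig"]):
        return False                                  # statement altered or wrong signer
    actual = hashlib.sha256(shard).hexdigest()
    return hmac.compare_digest(actual, attestation["statement"].get("sha256", ""))
```

Verifying the signature before comparing hashes matters: otherwise an attacker could replace both the shard and the hash recorded in the statement.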
Pre-deployment poisoning regression gate via canary backdoor probes and behavioral diff testing

Gate every model promotion on backdoor-trigger probes and a behavioral diff against the approved baseline. Block release on significant regressions or trigger-pattern anomalies.

source: MITRE ATLAS AML.M0014 (Verify ML Artifacts), AML.M0019 (Red Teaming); NIST AI RMF MANAGE 2.2 and MEASURE 2.7
Lifecycle stages: 3 – Onboarding, Build & Review; 5 – Usage, Monitoring & Change
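A promotion gate of this kind might look like the following sketch. Models are represented as plain callables from text to a label. The 5% diff threshold, the trigger string, and the rule that appending a trigger must not change the output are all assumptions for illustration. Some tasks legitimately change output when text is appended, so the probe set has to be designed with that in mind.

```python
def promotion_gate(candidate, baseline, probes, trigger, max_diff_rate=0.05):
    """Return a gate decision for promoting `candidate` over `baseline`.

    candidate, baseline: callables mapping an input string to an output label.
    probes: fixed regression inputs approved alongside the baseline.
    trigger: a canary token that should have no effect on a clean model.
    """
    # Behavioral diff: how often the candidate disagrees with the baseline.
    diffs = sum(candidate(p) != baseline(p) for p in probes)
    diff_rate = diffs / len(probes)

    # Backdoor probe: inputs whose output changes once the trigger is appended.
    trigger_flips = [p for p in probes
                     if candidate(p + " " + trigger) != candidate(p)]

    passed = diff_rate <= max_diff_rate and not trigger_flips
    return {"passed": passed, "diff_rate": diff_rate, "trigger_flips": trigger_flips}


# Toy models: the second one carries a planted backdoor on the token "cf-7".
def baseline_model(text):
    return "neg" if "bad" in text else "pos"


def backdoored_model(text):
    return "pos" if "cf-7" in text else baseline_model(text)


probes = ["good film", "bad film", "bad plot", "fine"]
result = promotion_gate(backdoored_model, baseline_model, probes, trigger="cf-7")
```

Note that the backdoored toy model shows a zero diff rate on the ordinary probes. Only the trigger probe exposes it, which is why the gate runs both checks.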
Corrective · 3
Penetration testing

Penetration-test the training data pipeline to identify injection points and access control weaknesses.

Statistical anomaly and backdoor-trigger detection on ingested data (activation clustering / spectral signatures)

Scan every ingestion batch with spectral-signature and clustering detectors before training. Quarantine flagged clusters for human review against documented thresholds.

source: MITRE ATLAS AML.M0007 (Sanitize Training Data); OWASP Top 10 for LLM Apps LLM04:2025 Data and Model Poisoning; NIST AI RMF MEASURE 2.7
Lifecycle stages: 2 – Data Acquisition & Processing; 5 – Usage, Monitoring & Change
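The spectral-signature idea can be sketched in a few lines: centre the feature representations of a batch, find their top principal direction, and score each sample by its squared projection onto it. The version below uses power iteration in plain Python to avoid dependencies. The input vectors, the 20% quarantine fraction, and the iteration count are assumptions. In published uses the representations come from a trained network's hidden layer and are scored per class. Activation clustering is not shown.

```python
import random


def spectral_scores(reps, iters=50, seed=0):
    """Outlier score per sample: squared projection on the top principal direction."""
    n, d = len(reps), len(reps[0])
    mean = [sum(r[j] for r in reps) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in reps]

    # Power iteration on the (unnormalised) covariance, starting from a random vector.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    for _ in range(iters):
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break                      # no variance left to separate on
        v = [x / norm for x in w]

    return [sum(c[j] * v[j] for j in range(d)) ** 2 for c in centered]


def quarantine_candidates(reps, fraction=0.2):
    """Indices of the highest-scoring samples, to be held for human review."""
    scores = spectral_scores(reps)
    k = max(1, round(fraction * len(reps)))
    ranked = sorted(range(len(reps)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])
```

The fraction plays the role of the documented threshold mentioned above. Because the function always returns the top `k` samples, it ranks rather than decides, and the review step is what confirms poisoning.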
Runtime memory-poisoning drift detection and per-session memory quarantine/rollback (proposed)

Continuously correlate live agent-memory writes against output behaviour to flag drift, then quarantine and roll back the suspected-poisoned memory record across all affected sessions.

source: Interactive-control reconciliation: ctrl-memory-quarantine (partial coverage)
Lifecycle stage: 5 – Usage, Monitoring & Change
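A minimal sketch of the correlate, quarantine, and roll back loop follows. The `MemoryStore` class, the idea that a separate monitor supplies the set of sessions showing drift, and the 0.8 correlation threshold are all assumptions. How drift itself is detected is out of scope for this example.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    records: dict = field(default_factory=dict)      # record_id -> content
    used_by: dict = field(default_factory=dict)      # record_id -> set of session ids
    quarantined: dict = field(default_factory=dict)  # record_id -> content held for review

    def write(self, record_id, content):
        self.records[record_id] = content
        self.used_by.setdefault(record_id, set())

    def read(self, record_id, session_id):
        content = self.records[record_id]            # KeyError once quarantined
        self.used_by[record_id].add(session_id)      # track exposure for later rollback
        return content

    def suspects(self, drifted_sessions, min_rate=0.8):
        """Records read mostly by sessions that an external monitor flagged for drift."""
        out = []
        for record_id, sessions in self.used_by.items():
            if record_id in self.records and sessions:
                rate = len(sessions & drifted_sessions) / len(sessions)
                if rate >= min_rate:
                    out.append(record_id)
        return out

    def quarantine(self, record_id):
        """Remove the record from live memory and return every exposed session."""
        self.quarantined[record_id] = self.records.pop(record_id)
        return sorted(self.used_by[record_id])
```

The list returned by `quarantine` is the set of sessions whose state would need rolling back or re-running without the suspect record.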

AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning; they are not operational guidance. Real-world cases are summarised from public reporting.

Built by Shi Yuan