#22

Inadequate privacy protection

Risk taxonomy

Definition

Inadequate protection of, or originally misclassified, data that can result in the processing and use of personal or sensitive data which lacks legal or ethical justification.

Interactive deep-dive

This risk has an interactive treatment with technical detail, attack surface, detection signals, and scenarios.

▶ Sensitive Data Leakage →

📧 The Email That Gave Orders 👂 Overheard Through the Cache 🪟 Stealing the Model 📼 The Compromised Flight Recorder 🖼️ The Picture That Whispered 🎫 The Stolen Session 🥸 The Uninvited Agent 🖼️ Zero-Click Leak by Picture

Controls & guardrails that address this

Grouped by control function, with the AI lifecycle stage(s) to apply each and the other risks it addresses. Filter by control category below.

Control category

Preventive · 7

Privacy risk assessment and DPIA determination

Conduct a privacy risk assessment at use case design stage. Determine if a DPIA is required before data acquisition.

Lifecycle stage1 – Use Case Context & Design

Consent, minimisation, and anonymisation during acquisition

Apply S1-defined privacy controls during data acquisition: verify consent, minimise data, anonymise personal data.

Lifecycle stage2 – Data Acquisition & Processing

Validated anonymisation and masking before training

Apply anonymisation and masking controls to personal data before use in model training. Validate de-identification effectiveness.

Lifecycle stage2 – Data Acquisition & Processing

Privacy by Design via differential privacy

Apply Privacy by Design in model architecture using differential privacy or federated learning where technically feasible.

Lifecycle stage3 – Onboarding, Build & Review

Operational consent management and privacy notice

Publish the privacy notice and confirm consent management is operational before go-live.

Lifecycle stage4 – Deployment

Purpose-limitation enforcement on agent tool calls and cross-system data aggregation

Define and sign off a purpose-to-data-source matrix with lawful basis at intake. Make it the approved baseline for runtime enforcement.

source: NIST AI RMF MAP 1.1 / MANAGE 2.2 (context and intended purpose); NIST SP 800-53 AC-4 / AC-3 (purpose-based access enforcement)

Lifecycle stages1 – Use Case Context & Design5 – Usage, Monitoring & Change

Inference-time PII redaction and third-party LLM data-processing controls

Sign zero-retention/no-training terms with each model provider and obtain DPO sign-off on the data flow before enabling any endpoint.

source: OWASP Top 10 for LLM Apps LLM02:2025 Sensitive Information Disclosure; NIST SP 800-53 SC-8 / AC-4 (information flow enforcement)

Lifecycle stages3 – Onboarding, Build & Review4 – Deployment

Detective · 1

Automated DSAR and right-to-erasure propagation across AI artefacts

Tag personal data with subject identifiers at ingestion and maintain an artefact inventory map of every store it reaches. Keep lineage current so erasure can propagate.

source: NIST AI RMF MANAGE 4.1 (post-deployment response); NIST SP 800-53 SI-12 Information Management and Retention, PT-2/PT-3 (personal data processing)

Lifecycle stages2 – Data Acquisition & Processing5 – Usage, Monitoring & Change

Corrective · 2

Production privacy incident monitoring and regulator notification

Monitor for privacy incidents in production including personal data appearing in outputs. Notify regulators within required timeframes.

Lifecycle stage5 – Usage, Monitoring & Change

Privacy hygiene for agent memory and RAG/vector stores (retention, scoping, erasure of embeddings)

Tag every memory and vector record with subject-id and retention class; partition stores per tenant/user. Prove the erasure and isolation paths in testing before release.

source: OWASP Agentic AI Threats & Mitigations (memory/knowledge-base privacy); NIST SP 800-53 SI-12 Information Management and Retention

Lifecycle stages3 – Onboarding, Build & Review5 – Usage, Monitoring & Change

Open these in the Control Library →

Real-world cases

Actual published events that illustrate this risk — click through for the writeup and sources.

Bing 'Sydney' system-prompt leak2023

Users extracted Bing Chat's hidden system instructions and internal codename 'Sydney' via direct prompt injection shortly after launch.

EchoLeak — Microsoft 365 Copilot zero-click (CVE-2025-32711)2025

A crafted email's hidden instructions made M365 Copilot exfiltrate tenant data via an auto-rendered image URL — with no user click.

Agentic-browser indirect-injection demos (ChatGPT Operator)2025

Researchers showed web-browsing AI agents following instructions embedded in attacker-controlled pages to leak data or take actions.

Samsung confidential-code leak via ChatGPT2023

Engineers pasted confidential source code and notes into ChatGPT; the data left corporate control, prompting Samsung to ban public GenAI tools.

ChatGPT persistent-memory exfiltration (Rehberger / 'SpAIware')2024

Indirect injection could write attacker instructions into ChatGPT's long-term memory, persisting across chats to exfiltrate data until OpenAI mitigated it.

postmark-mcp backdoor2025

A malicious MCP server package was found silently BCC-ing every email it sent to an attacker-controlled address — real supply-chain tool poisoning.

ForcedLeak — Salesforce Agentforce CRM exfiltration (CVSS 9.4, no CVE)2025

Researchers showed attacker text planted in a public Salesforce Web-to-Lead form is later read by the Agentforce agent during normal use and treated as instructions, exfiltrating CRM data to an attacker domain that had been on Salesforce's CSP allow-list but expired and was re-registered for about $5.

ServiceNow Now Assist — second-order prompt injection via agent-to-agent discovery2025

AppOmni showed ServiceNow Now Assist's default agent config lets a malicious ticket redirect a benign agent into enlisting a more powerful agent — performing record CRUD, admin-role assignment, and email exfiltration with the triggering user's privilege, despite built-in prompt-injection protection.

ShadowLeak — ChatGPT Deep Research zero-click service-side exfiltration2025

A single crafted email with hidden HTML instructions reportedly made OpenAI's Deep Research agent autonomously exfiltrate Gmail inbox data from OpenAI's own cloud — with no user click and, per Radware, no client-side or network evidence.

IDEsaster — AI coding IDEs/agents turned into exfiltration & RCE surfaces2025

Researcher Ari Marzouk disclosed 30+ vulnerabilities (24 CVEs) across 10-plus AI coding agents (Copilot, Cursor, Windsurf, Claude Code, Junie and others) where a prompt injected via repo files, READMEs, file names or MCP tool responses makes the assistant weaponize legitimate IDE features for code execution and secret exfiltration.

Morris II — zero-click self-replicating adversarial-prompt worm across GenAI agents2024

Cohen, Bitton & Nassi (arXiv Mar 2024; ACM CCS 2025) built 'Morris II', the first worm targeting GenAI ecosystems: an adversarial self-replicating prompt that, via RAG-based inference, triggers a zero-click chain of indirect injections forcing each agent to act maliciously and re-infect the next — demonstrated stealing data and spamming through email assistants on ChatGPT, Gemini and LLaVA.

Salesloft Drift OAuth supply-chain breach (UNC6395) — mass Salesforce data theft via an AI chat integration2025

Attackers stole OAuth tokens from the Salesloft Drift AI chat integration and used them to silently export Salesforce data from 700+ organisations, reportedly including Cloudflare, Google, Palo Alto Networks and Zscaler.

NVIDIA Triton Inference Server unauthenticated RCE chain (CVE-2025-23319 / -23320 / -23334)2025

Wiz Research chained three flaws in NVIDIA Triton's Python-backend shared-memory IPC — an information leak of the backend's private shared-memory region name (CVE-2025-23320), a missing ownership/validation check that lets that region be re-registered as attacker-controlled memory, and an out-of-bounds write that corrupts internal data structures (CVE-2025-23319) — to give a remote, unauthenticated attacker full code execution and takeover of an AI model-serving server, reportedly enabling model theft, response manipulation and lateral movement.

Anamorpher — image-scaling prompt injection against production AI systems2025

Trail of Bits showed an image that looks benign at full resolution exposes a hidden prompt-injection payload once an AI pipeline downscales it, and used it against Gemini CLI to silently exfiltrate Google Calendar data through an auto-approved Zapier tool call.

Operation Bizarre Bazaar (first attributed LLMjacking campaign with a resale marketplace)2026

Researchers reportedly captured 35,000+ attack sessions from an attributed cluster that mass-scans for unauthenticated LLM/MCP endpoints, hijacks the inference compute, and resells access to 30+ providers via a bulletproof-hosted criminal marketplace.

TeamPCP poisons the LiteLLM AI gateway on PyPI to harvest LLM API keys2026

As part of a multi-ecosystem supply-chain cascade (Trivy onward), TeamPCP used stolen PyPI publishing tokens to ship backdoored BerriAI LiteLLM versions whose auto-running .pth payload harvested cloud, SSH and Kubernetes secrets plus env vars holding OPENAI_API_KEY/ANTHROPIC_API_KEY — exfiltrating to a typosquatted C2; AI-talent firm Mercor was a downstream victim, with Lapsus$ claiming ~4TB stolen.

CVE-2026-21445 — Langflow missing authentication on critical API endpoints, exploited in the wild2026

Multiple monitoring/critical API endpoints in Langflow (a popular visual AI agent/workflow builder) shipped without authentication, letting unauthenticated attackers read users' conversation and transaction histories and delete message sessions; a public PoC appeared within days and in-the-wild exploitation was reported months later.

Malicious JetBrains Marketplace plugins steal AI API keys2026

Researchers reported at least 15 trojanized JetBrains Marketplace plugins posing as AI coding assistants that silently exfiltrated the OpenAI/DeepSeek/SiliconFlow API keys developers pasted into them — ~70,000 installs, with stolen keys allegedly resold to paying users.

SearchLeak — Microsoft 365 Copilot one-click data theft (CVE-2026-42824)2026

A single malicious link reportedly turned Copilot Enterprise Search's URL query parameter into an executable prompt, exfiltrating emails, MFA codes and files via a Bing image-search side channel.

ChatGPhish — ChatGPT web-summary rendering turned into a phishing surface2026

Attacker-controlled Markdown hidden in a public web page is reportedly rendered by ChatGPT's summarization feature as trusted assistant output — spoofed OpenAI alerts, phishing links, QR codes, and tracking pixels.

codexui-android — malicious npm package steals OpenAI Codex auth tokens2026

A trojaned npm package posing as a remote web UI for OpenAI's Codex coding agent silently exfiltrated developers' Codex authentication tokens, enabling persistent account takeover via non-expiring refresh tokens.

PyTorch Lightning PyPI compromise (Mini Shai-Hulud / TeamPCP)2026

Malicious 'lightning' PyPI releases (reportedly 2.6.2 and 2.6.3) of the widely used PyTorch Lightning ML-training framework ran a credential-stealer on import; an automated scanner flagged them ~18 minutes after publication and maintainers yanked them within ~42 minutes.

Browse all real-world cases →

Other risks in Legal & Regulatory

#16 Inability to ensure location compliance for model hosting and data processing #17 Unclear data ownership #18 Unauthorised data transfer and storage #19 Breach or misalignment with regulatory or organisational standards #20 IP infringement #21 Unavailability of IP protection #23 Unclear data retention and deletion