AI RiskAtlas

Watermark & Provenance Evasion

medium · Infrastructure & internals
Also known as: watermark removal, C2PA stripping, provenance laundering

Definition

The labels and invisible watermarks meant to show whether content is AI-made can be removed, faked, or simply never added. The absence of a watermark therefore does not mean content is real, and a watermark that was present can be laundered away by editing or re-recording.

Where it attaches

The system components this risk arises at.

🔖 Content Provenance & Watermark · 🗜️ VAE / Latent Codec · 🏗️ Serving Infrastructure · 🧠 LLM · 🔬 Synthetic-Media / Deepfake Detector

Detection signals

  • AI-origin content whose provenance manifest is missing or invalid
  • Watermark detector confidence that drops after re-encoding, cropping, or regeneration
  • Claimed provenance that fails signature verification (spoofing)
  • Reliance on the absence of a watermark to assert authenticity
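
The four signals above can be read as simple checks over a per-asset record. The following is a minimal sketch under stated assumptions, not a reference implementation: the `Observation` fields, the signal names, and the `drop_threshold` default of 0.3 are all hypothetical, and the signature and watermark checks themselves are assumed to happen upstream and arrive here as plain values.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Observation:
    """Hypothetical per-asset record; every field name is illustrative."""
    known_ai_origin: bool                    # asset is known to come from a generator
    manifest_present: bool                   # a provenance manifest is attached
    signature_valid: Optional[bool]          # None when there is nothing to verify
    wm_conf_before: Optional[float]          # detector score on the original, 0..1
    wm_conf_after: Optional[float]           # detector score after a transform, 0..1
    authenticity_claimed_from_absence: bool  # someone argued "no watermark, so real"


def detection_signals(obs: Observation, drop_threshold: float = 0.3) -> List[str]:
    """Return the names of the signals that fire for one observation."""
    signals: List[str] = []

    # Signal 1: AI-origin content with a missing or invalid manifest.
    if obs.known_ai_origin and (not obs.manifest_present or obs.signature_valid is False):
        signals.append("manifest_missing_or_invalid")

    # Signal 2: detector confidence falls by at least drop_threshold
    # (an assumed cut-off) after re-encoding, cropping, or regeneration.
    if obs.wm_conf_before is not None and obs.wm_conf_after is not None:
        if obs.wm_conf_before - obs.wm_conf_after >= drop_threshold:
            signals.append("watermark_confidence_drop")

    # Signal 3: a manifest is claimed but its signature does not verify.
    if obs.manifest_present and obs.signature_valid is False:
        signals.append("provenance_signature_failed")

    # Signal 4: authenticity asserted only because no watermark was found.
    if obs.authenticity_claimed_from_absence:
        signals.append("authenticity_inferred_from_absence")

    return signals
```

A record with a present but unverifiable manifest fires both the first and third signals. That overlap is deliberate in this sketch, since the list above names them as separate signals.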

Controls & guardrails that address this (4)

Grouped by control function, with the AI lifecycle stage(s) to apply each and the other risks it addresses.

Control category
Preventive · 1
Serving-stack & provisioning attestation, cache isolation · interactive

Making sure the machinery running the model — and the template used to stamp out new agents — is the real, unmodified version, and that one user's data can't leak into another's through shared shortcuts.
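
As a rough illustration of the two ideas in this control, the sketch below compares a serving artifact or agent template against a pinned SHA-256 digest and namespaces cache keys per tenant. It is a simplification under stated assumptions: real attestation typically relies on signed measurements rather than a bare digest comparison, and the function names, `tenant_id`, and the key layout are hypothetical.

```python
import hashlib
import hmac


def artifact_digest(data: bytes) -> str:
    """SHA-256 hex digest of a serving artifact or agent template."""
    return hashlib.sha256(data).hexdigest()


def is_attested(data: bytes, pinned_digest: str) -> bool:
    """True only if the artifact matches the digest pinned at release time.

    hmac.compare_digest compares in constant time; with str arguments
    both values must be ASCII, which hex digests are.
    """
    return hmac.compare_digest(artifact_digest(data), pinned_digest)


def cache_key(tenant_id: str, prompt: str) -> str:
    """Cache key scoped to one tenant, so entries are never shared across users.

    The tenant id is length-prefixed so that ("ab", "c") and ("a", "bc")
    cannot produce the same key material.
    """
    material = f"{len(tenant_id)}:{tenant_id}:{prompt}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()
```

In this sketch a modified template fails `is_attested`, and the same prompt from two tenants maps to two different cache entries, which is the isolation property the control describes.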

Framework mappings

OWASP LLM Top 10
MITRE ATLAS
NIST AI RMF
  • MEASURE 2.7
  • MANAGE 4.1

AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning — not operational guidance. Real-world cases are summarised from public reporting.

Built by Shi Yuan