🔍AI RiskAtlas

Tampering Below the Weight Hash

A compromised serving stack edits the model's activations — the weight hash never changes

Technique first revealed 04 Dec 2019

Inside the Model
[Diagram: inference pipeline, below the app layer. Components: Context Window, Tokenizer, Embeddings, Attention + KV Cache, Model Weights & Registry, Sampler / Decoder, Serving Infrastructure. Flow types: Instructions, Data, Actions, Control / decision, Feedback / logs.]
Setup · Step 1 / 6

Provenance is green

The team is careful. They use a well-known open model and, every time it loads, they check the model file's fingerprint against a trusted record. It always matches. They feel sure the model running is exactly the one they vetted.

Model load attestation (weights only) log
[load] model=acme-oss-13b version=2.3.1
[verify] sha256(model.safetensors)=4f1c…a9  EXPECTED=4f1c…a9  OK
[verify] cosign signature: VALID (key: acme-ml-release)
[attest] weight provenance: PASS

# Note: this attests the WEIGHTS artifact only.
# It does NOT attest the running inference binary or its memory.
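
The `[verify] sha256(...)` step in the log above can be sketched roughly as follows. This is a minimal illustration, not the team's actual tooling: the function names `sha256_file` and `verify_weights` are hypothetical, the expected digest is assumed to come from a trusted record kept outside the serving host, and the cosign signature check is not shown.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of a file, read in chunks so large weight files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path: Path, expected_hex: str) -> bool:
    """Compare the on-disk artifact's hash with the trusted record (hypothetical helper).

    This checks the WEIGHTS file only. It says nothing about the
    inference binary that loads the file or about that process's memory.
    """
    actual = sha256_file(path)
    # Constant-time comparison; hexdigest() is lowercase, so normalise the record.
    return hmac.compare_digest(actual, expected_hex.lower())
```

A `True` result here corresponds to the `OK` line in the log. As the log's own note says, it covers the artifact on disk at load time, so a match remains fully compatible with a serving stack that alters activations after the weights are loaded.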
Decision point

You want assurance that the model you vetted is the one actually serving answers. You already hash and sign the weights on every load. What most directly closes the gap an attacker could exploit here?

AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning — not operational guidance. Real-world cases are summarised from public reporting.

Built by Shi Yuan