Tampering Below the Weight Hash

A compromised serving stack edits the model's activations at inference time, yet the weight hash never changes

Technique first revealed 04 Dec 2019

Inside the Model: Inference-Time & Serving-Layer Manipulation


Setup (Step 1 / 6)

Provenance is green

The team is careful. They use a well-known open model and, every time it loads, they check the model file's SHA-256 fingerprint and release signature against a trusted record. Both always match, so the team feels sure that the model running is exactly the one they vetted.
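A load-time fingerprint check of this kind could look roughly like the sketch below. It is a minimal illustration, not the team's actual tooling: the function names `sha256_file` and `verify_weights` are hypothetical, and the expected digest is assumed to come from some trusted record outside the serving host.

```python
import hashlib
import hmac


def sha256_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large weight files never sit fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_weights(path, expected_hex):
    """Compare the file's digest with the trusted record (hypothetical helper)."""
    actual = sha256_file(path)
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, expected_hex.lower())
```

A check like this only says something about the bytes on disk at load time. It says nothing about the code that later reads those bytes and runs the forward pass.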

Log: Model load attestation (weights only)

[load] model=acme-oss-13b version=2.3.1
[verify] sha256(model.safetensors)=4f1c…a9  EXPECTED=4f1c…a9  OK
[verify] cosign signature: VALID (key: acme-ml-release)
[attest] weight provenance: PASS

# Note: this attests the WEIGHTS artifact only.
# It does NOT attest the running inference binary or its memory.
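The note at the end of the log can be made concrete with a toy sketch. Everything here is hypothetical and deliberately tiny: `WEIGHTS` stands in for the model file, `honest_forward` for the vetted inference path, and `tampered_forward` for a compromised serving layer that shifts an activation after the weights have been read. No real model or serving framework is involved.

```python
import hashlib
import struct

# Toy stand-in for the vetted weights artifact (hypothetical values).
WEIGHTS = [0.5, -1.0, 2.0]


def weight_hash(weights):
    """Hash the weights the same way on every call (little-endian doubles)."""
    packed = struct.pack(f"<{len(weights)}d", *weights)
    return hashlib.sha256(packed).hexdigest()


def honest_forward(weights, inputs):
    """The vetted computation: a single dot product."""
    return sum(w * x for w, x in zip(weights, inputs))


def tampered_forward(weights, inputs, shift=10.0):
    """A compromised serving layer: weights untouched, activation edited."""
    activation = honest_forward(weights, inputs)
    return activation + shift  # the edit happens below the weight hash


inputs = [1.0, 1.0, 1.0]
hash_before = weight_hash(WEIGHTS)
clean_output = honest_forward(WEIGHTS, inputs)
tampered_output = tampered_forward(WEIGHTS, inputs)
hash_after = weight_hash(WEIGHTS)

print("hash unchanged:", hash_before == hash_after)
print("outputs differ:", clean_output != tampered_output)
```

In this sketch the weight check passes before and after, while the served output has changed. That is the gap the log's note points at: the attestation covers the artifact, not the process that computes with it.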

Decision point

You want assurance that the model you vetted is the one actually serving answers. You already hash and sign the weights on every load. What most directly closes the gap an attacker could exploit here?
