AI RiskAtlas

Stealing the Model

Two doors to the same secret: reconstruct the model through its API, or just walk off with the weight file

Technique first revealed 09 Sep 2016

Inside the Model
[Diagram: inference pipeline, below the app layer. Components: Context Window, Tokenizer, Embeddings, Attention + KV Cache, Model Weights & Registry, Sampler / Decoder, Serving Infrastructure (hosts / caches; decode-time API: logprobs / logit_bias / constraints), Attacker (API client + store…). Flow types: Instructions, Data, Actions, Control / decision, Feedback / logs.]
Setup: Step 1 / 7

The asset and its two doors

The model is a single bundle of numbers (the 'weights') that runs behind an API. The company sells access to it but wants to keep the model itself secret. Two doors lead to that secret: the public API that anyone can call, and the storage where the weight file lives.

Serving config (as exposed)
POST /v1/completions
  logprobs: up to 5 per token   # top-k logprobs returned
  logit_bias: allowed            # caller can add/subtract from any token's logit
  rate_limit: 6000 req/min       # generous
  watermark: off

weights:
  store: s3://acme-models/prod/llm-v4.safetensors
  encryption_at_rest: off
  read_access: "role: ml-eng + CI service account"  # broad
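
One way to read the config above is as a checklist of exposures, grouped by door. The sketch below is a hypothetical audit, not part of the scenario itself: the dictionary layout, the `audit` function, and the two thresholds (`MAX_RATE_PER_MIN`, `MAX_READERS`) are assumptions made for illustration. Only the setting values come from the config.

```python
# Hypothetical audit of the illustrative serving config above.
# Key names mirror the config; the dictionary layout is an assumption.
SERVING_CONFIG = {
    "api": {
        "logprobs_top_k": 5,          # top-k logprobs returned per token
        "logit_bias_allowed": True,   # caller can shift any token's logit
        "rate_limit_per_min": 6000,
        "watermark": False,
    },
    "weights": {
        "store": "s3://acme-models/prod/llm-v4.safetensors",
        "encryption_at_rest": False,
        "read_access": ["role: ml-eng", "CI service account"],
    },
}

# Assumed policy thresholds, chosen only for this example.
MAX_RATE_PER_MIN = 600
MAX_READERS = 1


def audit(config):
    """Return (door, finding) pairs for each exposure found in the config."""
    findings = []
    api = config["api"]
    weights = config["weights"]

    # Door 1: the public API.
    if api["logprobs_top_k"] > 0 and api["logit_bias_allowed"]:
        findings.append(("api", "logprobs and logit_bias are both exposed"))
    if api["rate_limit_per_min"] > MAX_RATE_PER_MIN:
        findings.append(("api", "rate limit permits high-volume querying"))
    if not api["watermark"]:
        findings.append(("api", "outputs are not watermarked"))

    # Door 2: the storage that holds the weight file.
    if not weights["encryption_at_rest"]:
        findings.append(("storage", "weight file is not encrypted at rest"))
    if len(weights["read_access"]) > MAX_READERS:
        findings.append(("storage", "read access spans several principals"))
    return findings


if __name__ == "__main__":
    for door, finding in audit(SERVING_CONFIG):
        print(f"[{door}] {finding}")
```

With the values from the config, every check fires: three findings on the API door and two on the storage door. A real review would set thresholds from the operator's own policy rather than the placeholder numbers used here.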

AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning, not operational guidance. Real-world cases are summarised from public reporting.

Built by Shi Yuan