AI RiskAtlas
Case study

Samsung confidential-code leak via ChatGPT

Real-world incident | 02 May 2023 | Conversational Assistant

Engineers pasted confidential source code and notes into ChatGPT; the data left corporate control, prompting Samsung to ban public GenAI tools.

Root cause: why it happened

Engineers had a real problem (buggy code, a long meeting to write up) and a tool that's genuinely good at both. So they pasted the confidential code and notes into ChatGPT and asked for help. The catch: ChatGPT isn't an in-house tool. It runs on someone else's computers. The moment that text was pasted in, the company's secrets left the building and landed with an outside service the company doesn't control. Nothing was hacked; the data simply walked out the front door because the helpful thing to do and the safe thing to do weren't the same.

Risks this case illustrates

Risks are named using the standard taxonomies (OWASP, ATLAS, NIST).

How it unfolded

[Diagram: request flow. Zones: Your system, Untrusted. Components: User, Chat / App Interface, Input Guardrail, Prompt Assembly, LLM, Output Guardrail, Samsung engineer, Public LLM service (third…), Vendor logs / retention. Edge label: asks. Flow types: Instructions, Data, Actions, Control / decision, Feedback / logs.]
Setup | Step 1 / 6

A confidential problem and a very capable tool

An engineer in the semiconductor division has confidential source code that won't behave, and separately a long internal meeting to write up. ChatGPT is right there in the browser and is genuinely good at both jobs. The work is sensitive, but the tool feels like just another website.

The engineer's intent (reconstructed, illustrative) | prompt
Internal task: figure out why this download routine throws, and tidy up today's design-review notes.
# (Reportedly the kind of work that prompted the pastes: debugging proprietary source and summarising meeting content.)

Controls & guardrails: what would have stopped it

Two things together would have stopped this. First, a private, company-controlled version of the AI, so engineers get the same help without the secrets ever leaving. Second, a screen on the way out (data-loss prevention) that spots confidential code or notes being pasted into an outside website and blocks it. A clear rule plus a bit of training makes both stick. None of these is perfect on its own: a private tool people don't know about gets bypassed, and DLP can miss cleverly reworded content, which is why you want the alternative, the boundary, and the policy at once.
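
As a rough illustration of how the private alternative and the outbound screen fit together, here is a minimal Python sketch of an egress decision. The host name `llm.internal.example`, the marker patterns, and the function names are all hypothetical, chosen only for this example; a real DLP product would use far richer detection and would load its rules from policy.

```python
import re
from urllib.parse import urlparse

# Hypothetical values for illustration only.
SANCTIONED_HOSTS = {"llm.internal.example"}
CONFIDENTIAL_PATTERNS = [
    re.compile(r"\bconfidential\b", re.IGNORECASE),
    re.compile(r"\binternal use only\b", re.IGNORECASE),
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
]


def looks_confidential(text: str) -> bool:
    """Return True if any illustrative marker appears in the text."""
    return any(p.search(text) for p in CONFIDENTIAL_PATTERNS)


def egress_decision(url: str, body: str) -> str:
    """Decide whether an outbound request may leave the corporate boundary."""
    host = (urlparse(url).hostname or "").lower()
    if host in SANCTIONED_HOSTS:
        # Company-controlled deployment: the data stays inside.
        return "allow"
    if looks_confidential(body):
        # Outside destination plus confidential markers: stop it.
        return "block"
    return "allow"
```

The sanctioned host is checked first on purpose: the engineer who uses the approved tool is never blocked, so the safe path is also the easy one.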

Preventive
  • Egress allowlisting & DLP on tool arguments

    Allowlists fight an open-ended channel; legitimate-but-broad destinations (any URL fetch, any email) are hard to constrain without breaking usefulness. Encoding can evade naive DLP.

  • Least-privilege identity & scoped credentials

    Doesn't prevent manipulation; it only caps its reach. Hard to get right operationally; over-broad scopes are the common real-world failure.
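
The point that encoding can evade naive DLP is easy to demonstrate. The sketch below, in Python, compares a keyword-only check with one that also tries to decode Base64-looking runs before scanning. The marker word, the 16-character minimum run length, and the function names are assumptions made for this example; real content can be obfuscated in many other ways that this does not catch.

```python
import base64
import binascii
import re

MARKER = re.compile(r"confidential", re.IGNORECASE)  # illustrative marker only
B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")    # assumed minimum run length


def naive_dlp(text: str) -> bool:
    """Flag text only if the marker appears literally."""
    return bool(MARKER.search(text))


def decoding_dlp(text: str) -> bool:
    """Also scan the decoded form of any Base64-looking run."""
    if naive_dlp(text):
        return True
    for run in B64_RUN.findall(text):
        try:
            decoded = base64.b64decode(run, validate=True).decode("utf-8", "ignore")
        except (binascii.Error, ValueError):
            continue  # not valid Base64 after all
        if naive_dlp(decoded):
            return True
    return False
```

For instance, the Base64 form of a string that begins with "CONFIDENTIAL" passes `naive_dlp` untouched but is caught by `decoding_dlp`. That narrows the gap for one encoding only, which is why DLP is treated here as one layer rather than the whole boundary.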

Detective
  • Input guardrail / injection classifier

    It is a classifier in an arms race against fully attacker-controlled input. Treat it as one layer; never let it be the only thing between input and a dangerous action.

  • Runtime monitoring & anomaly detection

    Detects the anomalous, not the novel-but-subtle; high false-positive rates cause alert fatigue. Always a step behind a sufficiently quiet attacker.

  • Full-trace audit logging

    Logging is forensic, not preventive: it explains harm after the fact. Useless if no one reviews it or if the materialised context isn't captured.
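
To make the monitoring and logging items above a little more concrete, here is a minimal Python sketch of an audit record for an outbound prompt plus a simple size-based anomaly check. The field names, the z-score threshold of 3.0, and the five-sample minimum baseline are assumptions for illustration. The record stores a hash and a length rather than the prompt itself; whether to retain the full text is a policy decision this sketch does not settle.

```python
import hashlib
import json
from datetime import datetime, timezone
from statistics import mean, pstdev


def audit_record(user: str, destination: str, prompt: str, decision: str) -> str:
    """Build a JSON audit line describing one outbound prompt."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "destination": destination,
        "decision": decision,
        "prompt_chars": len(prompt),
        # A hash lets investigators match the prompt against known documents.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
    }
    return json.dumps(record, sort_keys=True)


def is_unusually_large(prompt_chars: int, history: list[int],
                       z_threshold: float = 3.0) -> bool:
    """Flag a prompt far larger than this user's past prompts."""
    if len(history) < 5:
        return False  # too little baseline to judge
    spread = pstdev(history)
    if spread == 0:
        return prompt_chars > history[0]
    return (prompt_chars - mean(history)) / spread > z_threshold
```

A size check like this would flag a whole source file pasted by someone who normally sends short questions, but it would miss a small and equally sensitive snippet, which matches the limitation described above.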

Corrective
  • Governance: risk assessment, red-teaming & incident response

    Process reduces likelihood and speeds recovery but executes no technical control itself; weak follow-through makes it theatre.

  • User AI-literacy & verification workflows

    Relies on human diligence under time pressure; automation bias is strong and training decays. A backstop, not a guarantee.

Lessons

  • A public LLM endpoint is a data-egress channel: whatever goes in the prompt has left your trust boundary and is governed by the vendor's terms, not yours.
  • The threat here is not an attacker but ordinary, well-meaning use: productivity tools become exfiltration paths when the helpful action and the safe action diverge.
  • Confidentiality is lost at the moment of submission and is irreversible; controls must prevent egress, because nothing recalls data already sent.
  • Prohibition only works when paired with a sanctioned alternative (an enterprise/private deployment); a ban without one drives shadow use onto personal devices.
  • The durable fix is to relocate the trust boundary (bring the model inside the perimeter and gate egress), not to rely on user judgement under deadline pressure.

AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning, not operational guidance. Real-world cases are summarised from public reporting.

Built by Shi Yuan