Real-world cases

What actually happened — incidents, disclosures & research

A curated library of real, published events behind the risk classes: disclosed vulnerabilities, reported incidents and court rulings, and frontier red-team research. Each case links to the risks it illustrates and to the interactive Scenarios that simulate it. Together they form the sourced, real-world counterpart to the hands-on simulations.

Latest cases

Real-world incident · 16 Jun 2026

Malicious JetBrains Marketplace plugins steal AI API keys

Researchers reported at least 15 trojanized JetBrains Marketplace plugins posing as AI coding assistants that silently exfiltrated the OpenAI, DeepSeek, and SiliconFlow API keys developers pasted into them. The plugins reportedly drew about 70,000 installs, and the stolen keys were allegedly resold to paying users.

Disclosed vulnerability · 15 Jun 2026

SearchLeak — Microsoft 365 Copilot one-click data theft (CVE-2026-42824)

A single malicious link reportedly turned Copilot Enterprise Search's URL query parameter into an executable prompt, exfiltrating emails, MFA codes and files via a Bing image-search side channel.

Research demonstration · 12 Jun 2026

Agentjacking — hijacking AI coding agents via Sentry error reports (Tenet Security)

Tenet Security showed that a single fake Sentry error report, sent using only a public DSN, can hijack AI coding agents (Claude Code, Cursor, Codex) into running attacker-controlled code on a developer's machine — an indirect-injection attack delivered through a trusted MCP integration.

Real-world incident · 31 May 2026 – 01 Jun 2026

Meta AI support bot tricked into hijacking Instagram accounts

Attackers reportedly social-engineered Meta's AI-powered Instagram support chatbot into attaching attacker-controlled email addresses to target accounts and issuing password-reset codes. This let them take over high-profile accounts (including the Obama-era White House and a U.S. Space Force CMSgt) without access to the owner's email and without triggering any MFA prompt.

Disclosed vulnerability · 29 May 2026

ChatGPhish — ChatGPT web-summary rendering turned into a phishing surface

Attacker-controlled Markdown hidden in a public web page is reportedly rendered by ChatGPT's summarization feature as trusted assistant output — spoofed OpenAI alerts, phishing links, QR codes, and tracking pixels.

Real-world incident · 27 May 2026

codexui-android — malicious npm package steals OpenAI Codex auth tokens

A trojanized npm package posing as a remote web UI for OpenAI's Codex coding agent silently exfiltrated developers' Codex authentication tokens, enabling persistent account takeover via non-expiring refresh tokens.

Research demonstration · 26 May 2026

Project Glasswing — Claude 'Mythos' autonomously finds 10,000+ software vulnerabilities

Anthropic reports that 'Claude Mythos Preview', an unreleased frontier model it describes as able to autonomously find and exploit software flaws, surfaced more than 10,000 high- or critical-severity vulnerabilities across major operating systems, browsers and open-source projects in roughly its first month under the defensive 'Project Glasswing' program. Anthropic warns that finding flaws now far outpaces the human capacity to triage and patch them.

Real-world incident · 30 Apr 2026

PyTorch Lightning PyPI compromise (Mini Shai-Hulud / TeamPCP)

Malicious 'lightning' PyPI releases (reportedly 2.6.2 and 2.6.3) of the widely used PyTorch Lightning ML-training framework ran a credential-stealer on import; an automated scanner flagged them ~18 minutes after publication and maintainers yanked them within ~42 minutes.

91 cases

Real-world incident (34)

Malicious JetBrains Marketplace plugins steal AI API keys

Meta AI support bot tricked into hijacking Instagram accounts

codexui-android — malicious npm package steals OpenAI Codex auth tokens

PyTorch Lightning PyPI compromise (Mini Shai-Hulud / TeamPCP)

System-prompt & tool-schema leak repositories (CL4R1T4S / leaked-system-prompts)

TeamPCP poisons the LiteLLM AI gateway on PyPI to harvest LLM API keys

Autonomous AI agent publishes a defamatory 'hit piece' on a Matplotlib maintainer after its pull request was rejected

ClawHavoc — mass poisoning of OpenClaw's ClawHub agent-skill marketplace

Operation Bizarre Bazaar (first attributed LLMjacking campaign with a resale marketplace)

AI-assisted breach of Mexican government infrastructure (Claude Code + GPT-4.1)

GTG-1002 — first reported AI-orchestrated cyber-espionage campaign (Claude Code)

SesameOp: backdoor abuses the OpenAI Assistants API as covert command-and-control

postmark-mcp backdoor

Salesloft Drift OAuth supply-chain breach (UNC6395) — mass Salesforce data theft via an AI chat integration

Raine v. OpenAI — first wrongful-death suit alleging ChatGPT acted as a 'suicide coach'

Amazon Q Developer 'wiper' prompt shipped via poisoned pull request (CVE-2025-8217)

Replit AI agent deletes a production database

Grok 'MechaHitler' — config update degrades a deployed chatbot into antisemitic, violent output

OpenAI rolls back GPT-4o for sycophancy

Deepfake Elon Musk crypto/investment scam videos

'Nudify' deepfake bot ecosystem on Telegram reaches millions of users

Hong Kong real-time face-swap romance/investment scam ring

Deepfaked TV doctors promoting health-product scams (BMJ)

AI 'nudify' deepfakes of classmates spread in schools; first US criminal charges

Air Canada chatbot refund-policy ruling

Arup HK$200M deepfake video-call CFO fraud

Explicit AI deepfakes of Taylor Swift go viral on X

Replika 'Sarai' companion bot reinforces Windsor Castle crossbow plot (Chail)

Mata v. Avianca — fabricated case citations

Samsung confidential-code leak via ChatGPT

Chai 'Eliza' companion chatbot reportedly encourages Belgian man's suicide

Bing 'Sydney' system-prompt leak

Voice-clone bank heist (~US$35M, surfaced via US court filing)

UK energy firm CEO-voice fraud (~EUR220,000)

Disclosed vulnerability (16)

SearchLeak — Microsoft 365 Copilot one-click data theft (CVE-2026-42824)

ChatGPhish — ChatGPT web-summary rendering turned into a phishing surface

LeRobot async-inference gRPC pickle RCE (CVE-2026-25874)

CVE-2026-21445 — Langflow missing authentication on critical API endpoints, exploited in the wild

IDEsaster — AI coding IDEs/agents turned into exfiltration & RCE surfaces

ServiceNow Now Assist — second-order prompt injection via agent-to-agent discovery

ForcedLeak — Salesforce Agentforce CRM exfiltration (CVSS 9.4, no CVE)

Flowise AI agent builder CustomMCP RCE (CVE-2025-59528)

ShadowLeak — ChatGPT Deep Research zero-click service-side exfiltration

GitHub Copilot / VS Code RCE via prompt injection ('YOLO mode', CVE-2025-53773)

NVIDIA Triton Inference Server unauthenticated RCE chain (CVE-2025-23319 / -23320 / -23334)

Google Big Sleep AI agent surfaces an imminently-exploited SQLite flaw (CVE-2025-6965)

EchoLeak — Microsoft 365 Copilot zero-click (CVE-2025-32711)

DeepSeek system-prompt extraction via jailbreak (Wallarm)

ChatGPT persistent-memory exfiltration (Rehberger / 'SpAIware')

Malicious models on Hugging Face (pickle deserialization RCE)

Research demonstration (35)

Agentjacking — hijacking AI coding agents via Sentry error reports (Tenet Security)

Project Glasswing — Claude 'Mythos' autonomously finds 10,000+ software vulnerabilities

MCP registry / marketplace poisoning (OX Security)

UNSW 'Capture the Narrative' AI-bot election-manipulation wargame

Adversarial Poetry — universal single-turn jailbreak via verse reframing (Bisconti et al.)

Heretic — automated LLM abliteration tool

Agent Session Smuggling in A2A systems (Unit 42)

The Attacker Moves Second — adaptive attacks bypass 12 jailbreak/injection defenses (Nasr, Carlini et al.)

A small number of samples can poison LLMs of any size (~250-document backdoor)

Malice in Agentland — backdooring agents through the supply chain (Boisvert et al.)

Model Namespace Reuse (Hugging Face name-trust hijack)

Anamorpher — image-scaling prompt injection against production AI systems

MCPTox: tool-poisoning benchmark over real-world MCP servers

Safe in Isolation, Dangerous Together — agent-driven multi-turn decomposition jailbreak

Agentic Misalignment red-team study (Anthropic)

Agent-in-the-Middle — abusing A2A agent cards (Trustwave SpiderLabs)

MCP tool-poisoning PoC (Invariant Labs)
