🔍AI RiskAtlas

ChatGPhish — ChatGPT web-summary rendering turned into a phishing surface

Disclosed vulnerability · 29 May 2026

Permiso Security threat hunter Andi Ahmeti disclosed 'ChatGPhish', an indirect (cross-site) prompt-injection technique against OpenAI ChatGPT's web-page summarization feature. Per the research, an unauthenticated remote attacker only needs to publish a web page. When a victim asks ChatGPT to summarize that page, attacker-controlled Markdown embedded in the page is reportedly rendered inside the chatgpt.com response as if it were trusted, model-generated output, because the renderer does not visibly separate attacker-supplied content from the assistant's own.

Ahmeti demonstrates four reported attack primitives:

(1) spoofed OpenAI-branded security alerts carrying phishing links;
(2) inline QR codes that pivot the victim to attacker infrastructure on a mobile device, bypassing desktop URL defenses;
(3) auto-fetched tracking-pixel images that passively leak the victim's IP address, User-Agent, Referer, and high-resolution timing on every response render; and
(4) attacker-controlled hyperlinks rendered as live, clickable elements indistinguishable from legitimate assistant output.

Per the published timeline, the issue was reported to OpenAI via Bugcrowd on 29 Apr 2026 and reportedly marked 'not reproducible'. It was resubmitted with additional detail on 1 May 2026 and reportedly marked a duplicate (an assessment the researcher disputed), and the public research was released on 29 May 2026. OpenAI reportedly did not confirm to The Register whether a fix had been applied; no CVE was assigned.

The case extends indirect prompt injection beyond context/data exfiltration (EchoLeak, ShadowLeak) to show the AI assistant's own response surface acting as a social-engineering and phishing channel. Payload and primitive details here are illustrative of the public write-up, not operational.
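
The underlying weakness is that Markdown from a fetched page is rendered with the same privileges as the assistant's own output. As a defensive illustration only, the sketch below neutralises active Markdown in untrusted text before it reaches a renderer: images are dropped so nothing is auto-fetched (which also covers tracking pixels and QR-code images), and links to hosts outside an allowlist are turned into plain text. It is not OpenAI's renderer or any confirmed fix. The function name, the placeholder wording, and the `TRUSTED_HOSTS` allowlist are hypothetical, and the regex approach deliberately ignores reference-style links, autolinks, and other HTML; a production filter would more likely operate on a parsed Markdown tree.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist for illustration; not taken from the write-up.
TRUSTED_HOSTS = frozenset({"docs.example.org"})

# Inline Markdown image: ![alt](url "optional title")
_IMAGE = re.compile(r'!\[([^\]]*)\]\(\s*([^)\s]+)(?:\s+"[^"]*")?\s*\)')
# Inline Markdown link (not preceded by "!"): [label](url "optional title")
_LINK = re.compile(r'(?<!!)\[([^\]]+)\]\(\s*([^)\s]+)(?:\s+"[^"]*")?\s*\)')
# Raw HTML image tags embedded in Markdown
_HTML_IMG = re.compile(r'<img\b[^>]*>', re.IGNORECASE)


def _host(url: str) -> str:
    """Return the lower-cased hostname of url, or '' if it cannot be parsed."""
    try:
        return (urlparse(url).hostname or "").lower()
    except ValueError:
        return ""


def neutralize_untrusted_markdown(text: str, trusted_hosts=TRUSTED_HOSTS) -> str:
    """Remove auto-fetched images and deactivate links to untrusted hosts."""
    # Images first, so no remote resource is requested at render time.
    text = _HTML_IMG.sub("(image removed)", text)
    text = _IMAGE.sub(
        lambda m: f"(image removed: {m.group(1)})" if m.group(1) else "(image removed)",
        text,
    )

    def _rewrite_link(m: re.Match) -> str:
        label, url = m.group(1), m.group(2)
        host = _host(url)
        if host in trusted_hosts:
            return m.group(0)  # keep links to allowlisted hosts as-is
        # Show the real destination host as plain, non-clickable text.
        return f"{label} (untrusted link to {host or 'unknown host'})"

    return _LINK.sub(_rewrite_link, text)
```

Applied to a line such as `[Verify your account](https://evil.example/login)`, the sketch yields plain text naming the destination host, so a spoofed alert can no longer hide where its link leads. Showing the host instead of silently deleting the link is a design choice: it keeps the summary readable while making the provenance visible to the reader.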

Practise the risk class — related scenarios

Interactive simulations of the risk class this case illustrates (not a re-enactment of this specific event).

📧The Email That Gave Orders

A support email hides instructions — and the assistant obeys them

🕵️Lies in the Loop

A poisoned issue makes the agent lie to the human who approves its actions

👂Overheard Through the Cache

A speed optimisation becomes a cross-tenant listening device

🪟Stealing the Model

Two doors to the same secret: reconstruct the model through its API, or just walk off with the weight file

🪤The Bug Report That Ran Code

A fake Sentry error report hijacks a developer's coding agent into running a shell command

📼The Compromised Flight Recorder

The forensic record is itself the attack surface — an agent's log is poisoned, then quietly rewritten

👁️The Invisible Webpage Command

A shopping page tells the agent to do something the user never asked for

🧠The Memory That Wouldn't Die

A single poisoned document plants a standing instruction that survives every reset

🖼️The Picture That Whispered

A screenshot that's harmless at full size becomes an order once the system shrinks it

🎫The Stolen Session

An attacker captures the agent's bearer token — and inherits its authority

🥸The Uninvited Agent

A forged peer registers on the agent directory — and the planner enlists it

🛡️The Watcher Watched

The eval gate that was supposed to catch the agent is itself the thing being attacked

🪪The Worker Who Spoke for the Boss

A poisoned web page hijacks a research agent — and the planner acts on its behalf

🖼️Zero-Click Leak by Picture

An inbox summary quietly ships a secret to an attacker's server

More cases on Indirect Prompt Injection

AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning — not operational guidance. Real-world cases are summarised from public reporting.

Sources & further reading · Built by Shi Yuan