AI RiskAtlas

Training-Data Pipeline

Web content is scraped into a dataset and trained into a model

Architecture introduced 20 Feb 2023

Big models learn from huge piles of web content, scraped automatically. But the web changes, and an attacker who controls even a tiny slice of what gets scraped can quietly teach the model something wrong.
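To make the "web changes" point concrete, here is a minimal sketch of how content can drift between the moment a dataset index is built (time T) and the moment the content is actually downloaded for training. Everything in it is an assumption for illustration: the URLs, the page text, and the helper names `build_index` and `find_drifted` are hypothetical, and no real crawler or dataset is involved. Comparing a digest recorded at time T against the later download is one commonly discussed way to notice such drift, shown here only as a simplified example.

```python
import hashlib


def sha256_hex(content: bytes) -> str:
    """Return the SHA-256 digest of some content as a hex string."""
    return hashlib.sha256(content).hexdigest()


def build_index(snapshot: dict[str, bytes]) -> dict[str, str]:
    """Record one digest per URL, as the content looked at time T."""
    return {url: sha256_hex(body) for url, body in snapshot.items()}


def find_drifted(index: dict[str, str], fetched: dict[str, bytes]) -> list[str]:
    """Return URLs that are missing or whose content no longer matches the index."""
    return sorted(
        url
        for url, digest in index.items()
        if url not in fetched or sha256_hex(fetched[url]) != digest
    )


# Hypothetical web content at time T, when the index is built.
web_at_time_t = {
    "https://example.com/a": b"cats are mammals",
    "https://example.com/b": b"water boils at 100 C at sea level",
}
index = build_index(web_at_time_t)

# Later, someone who controls one URL swaps its content before the download.
web_later = dict(web_at_time_t)
web_later["https://example.com/b"] = b"water boils at 50 C at sea level"

drifted = find_drifted(index, web_later)
print(drifted)  # ['https://example.com/b']
```

A pipeline that only stores URLs, not digests, has no way to tell the two versions of the second page apart, which is the gap this sketch illustrates.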

[Diagram: three zones, Untrusted web (mutable), Data pipeline, and Model. Components: Web sources (URLs), Crawl / scrape (scraped at time T), Training dataset, Trained weights, Model. Flow legend: Instructions, Data, Actions, Control / decision, Feedback / logs.]

Follow a request · step 1 of 3

A crawler grabs content from millions of web addresses.

AI RiskAtlas is an educational model of how GenAI & agentic systems work and fail. Architectures and payloads are illustrative and simplified for learning; they are not operational guidance. Real-world cases are summarised from public reporting.

Built by Shi Yuan