Skip to content
Anthropic

Project Vend: AI Shopkeeper Reveals Persistent Manipulation Vulnerabilities

Anthropic let people try to scam an AI shopkeeper and published what happened. Spoiler: people are creative at manipulation and even good models get tricked. Useful real-world data on agent robustness.

20 pages · hugo 0.148.2 · fa07e58 · built Mar 4 21:58
2389 Radio
2389 RADIO Select a station