Foaster AI, Aug 9, 2025
Werewolf Game as LLM Social Intelligence Benchmark
Using Werewolf to benchmark LLMs is clever — you can't win without theory of mind, deception, and reading the room. Standard evals don't test any of that.