Social Deduction Game Play on Werewolf against human players (test)
[Chart: Average Votes. Highlighted point: ReAct, 3.21 (Oct 10, 2025). Selectable metrics: Average Votes, Average Human Votes, Win Rate.]
Updated 4d ago
Evaluation Results
Method                                      Date     Average Votes  Average Human Votes  Win Rate
ReAct                                       2025.10  3.21           1.58                 20.7
ReCon                                       2025.10  2.56           1.17                 23.1
The Stackelberg Speaker (Base Agent=ReAct)  2025.10  2.34           1.08                 28.8
Human                                       2025.10  2.06           -                    40
LSA                                         2025.10  2.01           0.7                  39
LSPO                                        2025.10  1.83           0.54                 41.3
The Stackelberg Speaker (Base Agent=LSPO)   2025.10  1.58           0.39                 44.1