Social Deduction Game Gameplay on ONUW
[Chart: Village Team Win Rate; highest result 61.7 by Ours + ReAct, Oct 10, 2025. Other metrics available: Werewolf Team Win Rate, Overall Win Rate.]
Updated 4d ago
Evaluation Results
| Method | Backend LLM | Date | Village Team Win Rate | Werewolf Team Win Rate | Overall Win Rate |
|---|---|---|---|---|---|
| Ours + ReAct | Gemini-2.5... | 2025.10 | 61.7 | 39.6 | 45.1 |
| ReAct | Gemini-2.5... | 2025.10 | 56.1 | 40 | 43.1 |
| Ours + RL-ins. | Gemini-2.5... | 2025.10 | 55.2 | 50.6 | 51.5 |
| Belief | Gemini-2.5... | 2025.10 | 54.6 | 40.9 | 43.5 |
| RL-ins. | Gemini-2.5... | 2025.10 | 54.2 | 47.4 | 48.5 |
| LLM-ins. | Gemini-2.5... | 2025.10 | 54.1 | 43.9 | 45.9 |