Social Deduction Game Agent Evaluation on Human Evaluation Study (Good Players)
[Chart: Contributed Success over time. Highlighted point: GRAIL Agent, 3.9 (Jun 21, 2025). Selectable metrics: Contributed Success, Helpful Comments.]
Evaluation Results

Method           Sample Size (n)  Date     Contributed Success  Helpful Comments
GRAIL Agent      14               2025.06  3.9                  3.69
Reasoning Agent  14               2025.06  3.5                  3.4