Human Pairwise Comparison on Gaming Content: Familiar Games (N=60 samples)
[Chart: Win Rate, Tie Rate, and GPT-5.1 Win Rate. Top result: MeepleLM, 83.3 Win Rate, Jan 12, 2026. Updated 4d ago.]
Evaluation Results
| Method   | Metric             | Date    | Win Rate | Tie Rate | GPT-5.1 Win Rate |
|----------|--------------------|---------|----------|----------|------------------|
| MeepleLM | Authenticity Check | 2026.01 | 83.3     | 10       | 6.7              |
| MeepleLM | Opinion Diversity  | 2026.01 | 80       | 6.7      | 13.3             |
| MeepleLM | Emotional Reson... | 2026.01 | 76.7     | 13.3     | 10               |
| MeepleLM | Shareability       | 2026.01 | 73.3     | 16.7     | 10               |
| GPT-5.1  | Opinion Diversity  | 2026.01 | 13.3     | 6.7      | 80               |
| GPT-5.1  | Emotional Reson... | 2026.01 | 10       | 13.3     | 76.7             |
| GPT-5.1  | Shareability       | 2026.01 | 10       | 16.7     | 73.3             |
| GPT-5.1  | Authenticity Check | 2026.01 | 6.7      | 10       | 83.3             |
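The reported rates are consistent with percentages taken over the N=60 pairwise judgments, with each judgment counted as a win, a tie, or a loss. The sketch below shows that arithmetic. The function name `pairwise_rates` and the tally of 50 wins, 6 ties, and 4 losses are hypothetical: the page publishes only percentages, and this tally is simply one that reproduces the 83.3 / 10 / 6.7 figures for Authenticity Check.

```python
from collections import Counter


def pairwise_rates(outcomes):
    """Return (win, tie, loss) percentages, each rounded to one decimal.

    `outcomes` is a list of per-sample labels: "win", "tie", or "loss",
    judged from the point of view of the method being evaluated.
    """
    n = len(outcomes)
    if n == 0:
        raise ValueError("need at least one pairwise judgment")
    counts = Counter(outcomes)
    return tuple(round(100 * counts[label] / n, 1) for label in ("win", "tie", "loss"))


# Hypothetical tally over N=60 samples (assumed, not published on the page).
outcomes = ["win"] * 50 + ["tie"] * 6 + ["loss"] * 4
print(pairwise_rates(outcomes))  # (83.3, 10.0, 6.7)
```

With 60 samples, each judgment moves a rate by about 1.7 points, so small gaps between metrics correspond to only a few samples.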