Review Generation on Board Game Playtesting Dataset

99.46 Factuality

GPT-5.1

[Chart: score over time; y-axis ticks 58.2864, 68.9757, 79.665, 90.3543; x-axis tick Jan 12, 2026]
Updated 4d ago

Evaluation Results

Method (date)   Factuality   (unlabeled)   (unlabeled)   Links
2026.01         99.46        0.6934        4.26
2026.01         98.95        0.6572        3.56
2026.01         98.86        0.7117        4.34
-               98.28        0.648         3.98
2026.01         97.88        0.5936        1.58
2026.01         92.13        0.6771        3.56
2026.01         91.56        0.685         3.7
2026.01         59.87        0.697         3.3