Outcome Reasoning on CVQA Count

0.792 F1 Mean (M')

GPT-5

[Chart: score axis ticks 0.506, 0.58025, 0.6545, 0.72875; date axis tick May 17, 2025]
Updated 4d ago

Evaluation Results

Method    Links
2025.05    0.792    0.72
2025.05    0.762    0.693
2025.05    0.675    0.61
2025.05    0.635    0.569
2025.05    0.622    0.557
2025.05    0.601    0.538
2025.05    0.517    0.452