Outcome Reasoning on COCO

77.8 M' (F1 Mean)

GPT-5

[Chart; y-axis ticks: 48.16, 55.855, 63.55, 71.245; x-axis date: May 17, 2025]
Updated 4d ago

Evaluation Results

Method    Links
2025.05   77.8   70.1
2025.05   75.3   66.9
2025.05   65.4   58.7
2025.05   61.3   54.6
2025.05   60.1   53.4
2025.05   57.8   51.5
2025.05   49.3   42.7