
WeMath

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Mathematical Reasoning | WeMath | Accuracy 80.6 | 75 |
| Visual Mathematical Reasoning | WeMath | Accuracy 98.7 | 53 |
| Multimodal Reasoning | WeMath | Accuracy 63.8 | 43 |
| Multimodal Math Reasoning | WeMath | Accuracy 78 | 26 |
| Step-wise Verification | WeMath | Macro F1 63.9 | 18 |
| Mathematical multi-modal reasoning | WeMath | Pass@1 85.11 | 13 |
| First Incorrect Step Identification | WeMath | FISI F1 Score 24.9 | 6 |