
Real-world Multimodal Reasoning on RealWorldQA

75.4 Accuracy

GPT-4o

[Chart: accuracy over time, Apr 25, 2024 to Dec 1, 2024]
Updated 3d ago

Evaluation Results

Date     Accuracy
2024.09  75.4
2024.09  75.4
2024.09  71
2024.09  69
2024.04  68.7
-        68.7
2024.09  67.8
2024.04  67.5
2024.04  67.5
2024.09  67.5
2024.09  66.8
2024.04  66
2024.09  64.1
2024.09  62.6
2024.09  62.6
2024.09  62.5
2024.04  61.4
2024.09  61.4
2024.09  60.7
2024.09  60.5
2024.09  59.4
2024.09  59.4
2024.09  59
2024.09  57.8
2024.09  57.4
2024.12  57
2024.09  56.9
2024.09  56.5
2024.09  55.8
2024.09  55.8
2024.09  55.7
2024.09  55.6
2024.12  54.6
2024.12  53.7
2024.12  53.7
2024.09  53.3
2024.04  51.9
2024.09  51.2
2024.04  49.8
2024.12  42.3