Multimodal Reasoning on O3-BENCH (test)

0.756 Chart Score

INSIGHT-O3

Dec 21, 2025
Updated 3d ago

Evaluation Results

Date      Chart Score
2025.12   0.756         0.644   0.697
2025.12   0.737         0.485   0.604
2025.12   0.734         0.538   0.631
-         0.681         0.69    0.686
-         0.677         0.696   0.687
-         0.673         0.637   0.654
2025.12   0.673         0.564   0.615
-         0.618         0.592   0.604
2025.12   0.573         0.478   0.523
-         0.554         0.485   0.518
2025.12   0.544         0.339   0.436
2025.12   0.524         0.405   0.461
2025.12   0.515         0.385   0.446
2025.12   0.511         0.368   0.436
2025.12   0.493         0.321   0.402
2025.12   0.491         0.33    0.406
2025.12   0.466         0.526   0.498
2025.12   0.354         0.335   0.344
2025.12   0.353         0.341   0.346
2025.12   0.344         0.432   0.39
2025.12   0.344         0.383   0.364
2025.12   0.319         0.39    0.357
2025.12   0.309         0.244   0.274
2025.12   0.309         0.526   0.423
2025.12   0.278         0.524   0.408
2025.12   0.262         0.227   0.243
2025.12   0.245         0.212   0.228
2025.12   0.221         0.333   0.28
2025.12   0.211         0.194   0.202
2025.12   0.192         0.333   0.265