Real-world QA on RealworldQA v1.0 (test)
[Chart: Score by date. Top entry: GPT-4o, 75.5 (Mar 27, 2026).]
Updated 19d ago
Evaluation Results
Method           Settings                    Date     Score
GPT-4o           Tool Use=No, Param Size=-   2026.03  75.5
Qwen2.5-VL       Tool Use=No, Param Siz...   2026.03  70.2
Thyme            Tool Use=Yes, Param Si...   2026.03  70.2
InternVL2.5      Tool Use=No, Param Siz...   2026.03  70.1
VRE              Tool Use=No, Param Siz...   2026.03  69.8
Qwen2.5-VL       Tool Use=No, Param Siz...   2026.03  68.2
LLaVA-OneVision  Tool Use=No, Param Siz...   2026.03  66.3