
Real-world QA on RealworldQA v1.0 (test)

Score: 75.5

GPT-4o

[Chart: score axis ticks 65.932, 68.416, 70.9, 73.384; date Mar 27, 2026]
Updated 19d ago

Evaluation Results

Date      Score
2026.03   75.5
2026.03   70.2
2026.03   70.2
2026.03   70.1
2026.03   69.8
2026.03   68.2
2026.03   66.3