
Research Task on LiveResearchBench (LRB)

96.2 Accuracy

GPT 5.4

9.15231.75154.3576.949
May 16, 2026
Updated 15d ago

Evaluation Results

Method  Links
2026.05  96.2  -
2026.05  92.5  -
2026.05  91    13.5
2026.05  90.5  -
2026.05  90.5  -
2026.05  90    -
2026.05  86.9  -
2026.05  84.2  -
2026.05  82.5  -
2026.05  80.1  -
2026.05  80    66.2
2026.05  77.5  -
2026.05  77.5  -
2026.05  75    62.5
2026.05  68.5  17.5
2026.05  60.3  -
2026.05  51    -
2026.05  51    -
2026.05  50.6  -
2026.05  13.8  -
2026.05  12.5  -