
AI Agent Reasoning and Tool-use on GAIA

Level 1 Score: 78.49

h2oGPTe

[Chart: score over time, Feb 7, 2025 to Mar 17, 2026]
Updated 1mo ago

Evaluation Results

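| Date | Level 1 | Level 2 | Level 3 | | |
|---|---|---|---|---|---|
| 2025.02 | 78.49 | 64.78 | 40.82 | 65.12 | - |
| 2025.02 | 74.36 | 69.21 | 45.46 | 66.13 | - |
| | 74.29 | 69.06 | 47.6 | 67.36 | - |
| | 67.92 | 59.3 | 30.77 | 57.58 | - |
| 2026.03 | 67.9 | 58.1 | 46.1 | - | 59.3 |
| 2026.03 | 58.5 | 50 | 34.6 | - | 50.3 |
| 2025.02 | 58.06 | 51.57 | 24.49 | 49.17 | - |
| 2026.03 | 55.9 | 35.8 | 24.4 | - | 40.2 |
| 2026.03 | 47.1 | 31.4 | 11.5 | - | 33.4 |
| 2026.03 | 46.2 | 27.6 | 16.3 | - | 31.5 |
| 2026.03 | 26.4 | 15.1 | 3.8 | - | 16.9 |
| 2026.03 | 24.5 | 11.6 | 3.8 | - | 15.2 |
| 2026.03 | 19.3 | 12.4 | 10.3 | - | 14.2 |
| 2026.03 | 16.9 | 9.3 | 0 | - | 10.3 |
| 2026.03 | 9.4 | 1.2 | 0 | - | 3.6 |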