
General AI Assistant Reasoning on GAIA-Text-103 1.0 (test)

76.9 L1 Accuracy

Claude-3.7-Sonnet

[Chart: L1 Accuracy, data point dated Feb 3, 2026]
Updated 4d ago

Evaluation Results

Date       L1     L2     L3     Avg.
2026.02    76.9   57.7   33.3   62.1
2026.02    61.5   48.1   16.7   49.5
2026.02    56.4   44.2   16.7   45.6
2026.02    56.4   42.3   16.7   44.6
2026.02    53.3   34.6    8.3   38.9
2026.02    51.2   36.5    8.3   38.9
2026.02    51.2   28.8    8.3   34.9
2026.02    46.2   34.6    8.3   35.9
2026.02    35.9   13.5    0     20.4