GAIA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| General AI Assistant Tasks | GAIA | Accuracy: 87.4 | 266 |
| General AI Assistant tasks | GAIA | Avg Performance: 80.61 | 54 |
| Agentic Evaluation | GAIA | Accuracy: 28.12 | 50 |
| Deep search | gaia | Accuracy: 81.9 | 37 |
| General AI Assistant Task | GAIA (val) | Level 1 Score: 94.3 | 33 |
| Agentic Benchmarks | GAIA | Execution Time (min): 1.6 | 25 |
| Deep Search | GAIA text-only (val) | Accuracy: 70.9 | 24 |
| Embodied Agentic | GAIA | Accuracy: 0.672 | 21 |
| Deep Research | GAIA text-only original (test) | Pass@1: 74.1 | 20 |
| General AI Assistant | GAIA text | GAIA Average Score: 70.5 | 19 |
| Multi-turn tool use | GAIA | Pass@1: 76.4 | 18 |
| General AI Assistant Reasoning | GAIA Full | Accuracy: 60.12 | 18 |
| General AI Assistant Reasoning | GAIA (File/Reasoning/Others) | Accuracy: 56.21 | 18 |
| General AI Assistant Reasoning | GAIA (Web) | Accuracy: 63.33 | 18 |
| Agentic Reasoning | GAIA (val) | Average Score: 86.06 | 17 |
| Inference Time Consumption | GAIA | Latency (Research And Data): 11.2 | 16 |
| Information-Seeking | GAIA 103-question text-only | Pass@1: 75.7 | 16 |
| Deep Research | GAIA | Pass@1: 51.46 | 16 |
| Deep Research | GAIA | Pass@1: 70.5 | 15 |
| General AI Assistant Tasks | GAIA Level 3 original (test) | Performance: 37.5 | 15 |
| General AI Assistant Tasks | GAIA Level 2 original (test) | Perf (%): 59.3 | 15 |
| General AI Assistant Tasks | GAIA Level 1 original (test) | Performance (%): 83.02 | 15 |
| General AI Assistant Tasks | GAIA All levels original (test) | Performance (%): 63.19 | 15 |
| General Assistant Tasks | GAIA | Success Rate: 46.7 | 15 |
| General AI Assistant Task Completion | GAIA Text-Only | Accuracy: 0.874 | 15 |
Showing 25 of 63 rows