General AI Assistant Tasks on GAIA Level 2 (val)
[Chart: Accuracy. Top result: GPT-5, 54.2, Dec 7, 2025. Selectable metrics: Accuracy, 95% Confidence Interval, Between-group Variance, Intraclass Correlation Coefficient (ICC).]
Updated 4d ago
Evaluation Results

Method   Method (details)            Links     Accuracy   95% Confidence Interval   Between-group Variance   Intraclass Correlation Coefficient (ICC)
GPT-5    Model=GPT-5, Web searc...   2025.12   54.2       -                         0.187                    0.745
GPT-4o   Model=GPT-4o, Web sear...   2025.12   23.2       -                         0.119                    0.662
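
The page does not state how the Between-group Variance and ICC columns are computed. As a rough sketch only, assuming the common one-way definition ICC = between-group variance / (between-group variance + within-group variance), the two reported columns imply a total and a within-group variance. The function name `implied_variances` is hypothetical and not part of the leaderboard.

```python
def implied_variances(between_var: float, icc: float) -> tuple[float, float]:
    """Return (total, within) variance implied by the reported columns.

    ASSUMPTION: ICC = between / (between + within). The leaderboard does not
    confirm this definition, so treat the results as illustrative only.
    """
    if not 0.0 < icc <= 1.0:
        raise ValueError("icc must be in (0, 1]")
    total = between_var / icc        # between + within
    return total, total - between_var


# Reported (between-group variance, ICC) pairs from the results table.
rows = {"GPT-5": (0.187, 0.745), "GPT-4o": (0.119, 0.662)}

for name, (between, icc) in rows.items():
    total, within = implied_variances(between, icc)
    print(f"{name}: total={total:.3f}, within={within:.3f}")
```

If that definition holds, a higher ICC means more of the spread in scores comes from differences between groups than from variation within a group.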