General AI Assistant Tasks on GAIA Level 3 (val)
[Chart: Accuracy over time. Highlighted result: GPT-5, 44.2 accuracy, Dec 7, 2025. Available metrics: Accuracy, 95% Confidence Interval, Between-Subject Variance, Intraclass Correlation Coefficient (ICC).]
Updated 4d ago
Evaluation Results
| Method | Details | Links | Accuracy | 95% Confidence Interval | Between-Subject Variance | Intraclass Correlation Coefficient (ICC) |
|---|---|---|---|---|---|---|
| GPT-5 | Model=GPT-5, Web searc... | 2025.12 | 44.2 | - | 0.16 | 0.629 |
| GPT-4o | Model=GPT-4o, Web sear... | 2025.12 | 6.6 | - | 0.019 | 0.304 |