Agent (Tool Use) on OmniGAIA
[Chart: Score. Top result: Gemini-3.1 Pro, 68.9 (Apr 17, 2026). Updated 1mo ago.]
Evaluation Results
Method              Configuration              Date     Score
Gemini-3.1 Pro      thinking prompt=withou...  2026.04  68.9
Qwen3.5-Omni-Plus   thinking prompt=withou...  2026.04  57.2
Qwen3.5-Omni-Flash  thinking prompt=withou...  2026.04  33.9