Assistant Tasks on GAIA (official)
[Chart: Exact Match; top entry Tendem’s AI agent, 78.2, Feb 1, 2026]
Updated 1mo ago
Evaluation Results
Method                   Details                      Date      Exact Match
Tendem’s AI agent        human involvement=none       2026.02   78.2
Flowith                  -                            2026.02   78.1
Manus                    -                            2026.02   73.4
HAL Generalist Agent     backbone=Claude Sonnet...    2026.02   72.6
ChatGPT Deep Research    -                            2026.02   67.4
HF Open Deep Research    model=GPT-5 medium           2026.02   62.8