Task Execution on TEXTCRAFT-SYNTH Hard (eval)
[Chart: Success Rate over time. Top result: Recursive Agent, Success Rate 88 (May 7, 2026). Available metrics: Success Rate, Steps, Time (s).]
Updated 26d ago
Evaluation Results
Method | Links | Success Rate | Steps | Time (s)
Recursive Agent | Context Window (Train/... (2026.05) | 88 | 694 | 73.3
Single Agent | Context Window (Train/... (2026.05) | 20 | 252 | 180