| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Task-solving | CL-bench (test) | Overall Score (%): 25.8 | 16 |
| Context Learning | CL-Bench (test) | Overall Score: 12.85 | 8 |
| Agentic Long-context Reasoning | CL-bench (test) | Solve Rate: 26 | 6 |
| Context Learning Task-Solving | CL-Bench | Overall Score: 15.8 | 5 |
| Long Context & Context Learning | CL-Bench | Pass@1: 15.5 | 3 |