| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| IHEval | Llama3.1-8B | Language Detection (Reference) | 100 | 12 | 1mo ago |
| TEXTCRAFT-SYNTH 8K context Easy (evaluation) | Recursive | Success Rate | 100 | 4 | 26d ago |
| 200 TE-labeled queries (test) | MeetMaster-XL | TE-Success@1 | 62 | 3 | 3mo ago |
| TEXTCRAFT-SYNTH Hard (eval) | Recursive Agent | Success Rate | 88 | 2 | 26d ago |
| TEXTCRAFT-SYNTH Medium (eval) | Recursive Agent | Success Rate | 98 | 2 | 26d ago |
| TEXTCRAFT-SYNTH All (eval) | Recursive Agent | Success Rate | 96 | 2 | 26d ago |
| TEXTCRAFT-SYNTH 8K context Hard (evaluation set) | Recursive | Success Rate | 88 | 2 | 26d ago |
| TEXTCRAFT-SYNTH 8K context Medium (evaluation set) | Recursive | SR | 96 | 2 | 26d ago |
| TEXTCRAFT-SYNTH 8K context All (test) | Recursive | Success Rate | 95 | 2 | 26d ago |