| Dataset Name | SOTA Method | Metric | Trend | ||
|---|---|---|---|---|---|
| Aggregate BFCL and Meta-Tool | THINKBRAKE | Accuracy 86.6 | 20 | 1mo ago | |
| Meta-Tool Multiple | THINKBRAKE | Accuracy 91.1 | 20 | 1mo ago | |
| Meta-Tool Single | Phi-4-Reasoning | Accuracy 77.7 | 20 | 1mo ago | |
| BFCL Multi-Parallel v2 | THINKBRAKE | Accuracy 87.5 | 20 | 1mo ago | |
| BFCL Parallel v2 | Qwen3-4B-Thinking | Accuracy 87.5 | 20 | 1mo ago | |
| BFCL Multi-Parallel v1 | Qwen3-4B-Thinking | Accuracy 90.5 | 20 | 1mo ago | |
| BFCL Parallel v1 | THINKBRAKE | Accuracy 95.5 | 20 | 1mo ago | |
| Real-world tool usage | Compose by Focus | Success Rate 90 | 13 | 26d ago | |
| BigCodeBench | Qwen2.5-7B-Ins + ADR | Pass@1 41.67 | 3 | 2d ago | |