| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Large Model Performance Prediction | Benchmark-side Pattern Shift Math | Average Score: 46.59 | 6 |
| Large Model Performance Prediction | Benchmark-side Pattern Shift Chinese | Average Score: 42.16 | 3 |
| Large Model Performance Prediction | Benchmark-side Pattern Shift OCR | Score Avg: 47.6 | 3 |