| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Language Modeling | Qwen3 (val) | Validation Loss | 2.493 | 49 |
| Output Equivalence | Qwen3 | Exact Match | 65.6 | 13 |
| Long-Context Generation | Qwen3 Context length (60K) | Throughput Speedup (α) | 5.89 | 12 |
| Long-Context Generation | Qwen3 Context length (40K) | Throughput Speedup (α) | 5.37 | 12 |
| Long-Context Generation | Qwen3 Context length (30K) | Throughput Speedup (α) | 4.37 | 12 |
| Long-Context Generation | Qwen3 Context length (20K) | Throughput Speedup (α) | 3.73 | 12 |
| Training Throughput | Qwen3 32B (train) | Training Throughput (128K Seq Len) | 545.29 | 5 |
| vLLM Model Deployment and Inference | Qwen3-32B vLLM (inference) | Model Load Time (s) | 7.641 | 3 |
| vLLM Model Deployment and Inference | Qwen3-14B vLLM (inference) | Model Load Time (s) | 3.082 | 3 |
| vLLM Model Deployment and Inference | Qwen3-1.7B vLLM (inference) | Model Load Time (s) | 0.54 | 3 |