| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Model Discovery | Qwen-3B model tree Extended Discovery | Rank | 233.8 | 48 |
| Attention Operator Throughput | Qwen2.5 72B (64 Q-heads/8 KV-heads/128 Head-dimension) | Attention Throughput (TFLOPS) | 222.5 | 29 |
| Model Retrieval | Qwen-7B model tree (test) | Rank | 1 | 21 |
| Model Retrieval | Qwen-3B model tree (test) | Rank | 1 | 21 |
| LLM Training Optimization | Qwen 3 1.7B | Time Reduction | 0.149 | 18 |
| Fingerprint Similarity | Qwen 7B 2.5 | Fingerprint Similarity Score | 0.9979 | 18 |
| Long-Context Generation | Qwen3 Context length (50K) | Throughput Speedup (α) | 6.02 | 12 |
| Long-Context Generation | Qwen3 Context length 10K | Throughput Speedup (α) | 2.76 | 12 |
| Inference Throughput | Qwen3-8B | Throughput (tokens/s) | 26.74 | 9 |
| Jailbreak Detection | Qwen2.5-VL-7B | Accuracy | 89.2 | 9 |
| Adversarial Attack | Qwen VL 2.5 | CLIP Similarity (RN-50) | 0.2578 | 9 |
| Model Compression | Qwen 7B 2.5 | Model Size (GB) | 2.7 | 7 |
| LLM Training | Qwen2.5-7B 256K context (train) | Throughput (tokens/sec) | 1,301.6 | 7 |
| Multi-step LLM Interaction | Qwen3-1.7B Inference Performance (test) | High-Precision Ratio | 100 | 7 |
| Multi-step LLM Interaction | Qwen3-4B Inference Performance (test) | High-Precision Ratio | 100 | 7 |
| Multi-step LLM Interaction | Qwen3-8B Inference Performance (test) | High-Precision Ratio | 100 | 7 |
| Response length prediction | Qwen 7B 2.5 | Avg VRAM (MB) | 6.38 | 7 |
| Black-Box Adversarial Attack | Qwen VL 2.5 | KMR (a) | 87 | 6 |
| Multi-path Speculative Decoding | Qwen (test) | Throughput (tokens/s) | 22.54 | 6 |
| LLM Serving Performance | Qwen 7B Uniform distribution 32 model variants 2.5 | Throughput (req/s) | 0.86 | 6 |
| LLM Serving Performance | Qwen 7B (Zipf distribution, alpha=1.5) 2.5 (32 model variants) | Throughput (req/s) | 0.87 | 6 |
| LLM Attack Effectiveness | Qwen3-8B serving environment | TTFT (s) | 0.12 | 6 |
| Personalization | Qwen2.5-14B Wealth-Seeking | Wealth-Seeking Score | 67.5 | 6 |
| Denial-of-Service Attack | Qwen2.5-14B-instruct (test) | Response Length | 8,192 | 6 |
| Training Memory Usage Profiling | Qwen3-32B 16×H100s | Memory Footprint (Seq 128K) | 38.94 | 5 |