
Qwen3

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Runtime Speed | Qwen3 Query Projection Module | Throughput (k tokens/sec): 92.57 | 90 |
| Language Modeling | Qwen3 (val) | Validation Loss: 2.493 | 49 |
| Large Language Model Evaluation | Qwen3-0.6B Average (test) | Average Performance: 47.83 | 38 |
| Output Equivalence | Qwen3 | Exact Match: 65.6 | 13 |
| LLM Inference | Qwen3 Samsung Galaxy S25 Ultra 0.6B (test) | Prefill Throughput (min): 1,709.9 | 12 |
| Long-Context Generation | Qwen3 Context length (60K) | Throughput Speedup (α): 5.89 | 12 |
| Long-Context Generation | Qwen3 Context length 40K | Throughput Speedup (α): 5.37 | 12 |
| Long-Context Generation | Qwen3 Context length 30K | Throughput Speedup (α): 4.37 | 12 |
| Long-Context Generation | Qwen3 Context length 20K | Throughput Speedup (α): 3.73 | 12 |
| LLM Inference | Qwen3 Google Pixel 9 Pro XL 0.6B (test) | Prefill Throughput (min, tokens/sec): 591.01 | 10 |
| Data Selection | QWEN3-4B | Wall-clock Time: 1 | 8 |
| Training Throughput | Qwen3-30B-A3B workload | Throughput (tokens/s): 280,000 | 7 |
| Kernel-level Attention Speed and Memory Analysis | Qwen3-8B model dimensions (H=32, Hk=8, d=128, GQA 4:1) on A100 GPU (test) | Forward Pass Time (ms): 27.1 | 7 |
| Watermark Removal | Qwen3 8B | DIPMark: 14.21 | 6 |
| Model Merging | Qwen3-4B-Base Transfer 8 benchmarks | Math Accuracy: 32.65 | 6 |
| Training Throughput | Qwen3 32B (train) | Training Throughput (128K Seq Len): 545.29 | 5 |
| LLM Jailbreaking | Qwen3 4B Instruct 2507 | SRF: 71 | 4 |
| Hybrid-Dimension Reconfiguration | Qwen3-30B-A3B | Reconfiguration Time (s): 0.42 | 4 |
| Language Modeling | Qwen3-0.6B (val) | Validation Perplexity: 31.45 | 3 |
| vLLM Model Deployment and Inference | Qwen3-32B inference vLLM | Model Load Time (s): 7.641 | 3 |
| vLLM Model Deployment and Inference | Qwen3-14B vLLM (inference) | Model Load Time (s): 3.082 | 3 |
| vLLM Model Deployment and Inference | Qwen3-1.7B vLLM (inference) | Model Load Time (s): 0.54 | 3 |
| Optimizer state memory measurement | Qwen3-32B (train) | Optimizer State Memory (MB): 125.9 | 2 |
| Optimizer state memory measurement | Qwen3-30B-A3B (train) | Optimizer State Memory (MB): 111.4 | 2 |
Showing 24 of 24 rows