
Qwen

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Model Discovery | Qwen-3B model tree Extended Discovery | Rank: 233.8 | 48 |
| Jailbreak Defense | Qwen2-VL | ASR: 0 | 36 |
| Toxicity Defense | Qwen2-VL | Toxicity Score: 0.05 | 36 |
| Inference Throughput | Qwen3 Query Projection Module NVIDIA A40 | Throughput (k tokens/sec): 80.63 | 30 |
| Attention Operator Throughput | Qwen2.5 72B (64 Q-heads/8 KV-heads/128 Head-dimension) | Attention Throughput (TFLOPS): 222.5 | 29 |
| Training Throughput Analysis | Qwen 7B 2.5 | Training Throughput (tokens/s): 1,847 | 28 |
| Function Module Discovery | Qwen 7B-Instruct 2.5 | L(F): 64.6 | 24 |
| Function Module Discovery | Qwen 3B Instruct 2.5 | L(F): 56.9 | 24 |
| Function Module Discovery | Qwen2.5-1.5B-Instruct | L(F): 31.4 | 24 |
| Model Retrieval | Qwen-7B model tree (test) | Rank: 1 | 21 |
| Model Retrieval | Qwen-3B model tree (test) | Rank: 1 | 21 |
| Jailbreak Attack | Qwen2.5-7B | Normalized Rate (NR): 0.02 | 20 |
| LLM Training Optimization | Qwen 3 1.7B | Time Reduction: 0.149 | 18 |
| Fingerprint Similarity | Qwen 7B 2.5 | Fingerprint Similarity Score: 0.9979 | 18 |
| Hallucination Tracing | Qwen | Recall@k: 83.31 | 15 |
| Large Language Model Evaluation | Qwen-32B | MMLU: 80.81 | 13 |
| Long-Context Generation | Qwen3 Context length (50K) | Throughput Speedup (α): 6.02 | 12 |
| Long-Context Generation | Qwen3 Context length 10K | Throughput Speedup (α): 2.76 | 12 |
| LLM fingerprinting | Qwen 14B 2.5 | AUC: 100 | 10 |
| LLM fingerprinting | Qwen 7B 2.5 | AUC: 100 | 10 |
| Jailbreak Attack | Qwen2-VL | ASR: 96.4 | 10 |
| Jailbreak Attack | Qwen VL-235B 3 | ASR: 2.32 | 9 |
| Jailbreak Attack | Qwen2.5-VL-32B | ASR: 10.8 | 9 |
| Jailbreak Attack | Qwen2.5-VL-7B | ASR: 98 | 9 |
| Inference Efficiency | Qwen2.5-7B | Throughput (tokens/s): 1,480.2 | 9 |
Showing 25 of 89 rows