Qwen2.5

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Pairwise Preference Comparison | Qwen2.5-3B responses (test) | Avg Preference Score: 82.7 | 30 |
| Jailbreak Defense | Qwen2.5-7B Adaptive AutoDAN-T attack | ASR: 30 | 6 |
| PrefixLM Attention | Qwen2.5 72B (q=64, k=8) (1k) | PrefixLM Attention Throughput (TFLOPS): 103.61 | 4 |
| Language Modeling Inference | Qwen2.5-7B 8K context length | Decode Latency (ms/token): 7.1 | 4 |