Coding on LiveCodeBench (Pass@1)
[Chart: Pass@1 over time, Aug 4, 2025 to Mar 2, 2026. Top result: DeepSeek-R1-0528, 78.86 Pass@1.]
Updated 25d ago
Evaluation Results
| Method | Details | Date | Pass@1 |
|---|---|---|---|
| DeepSeek-R1-0528 | Quantization=W4A8-FP8 | 2026.02 | 78.86 |
| DeepSeek-R1-0528 | Quantization=FP8-Block... | 2026.02 | 77.1 |
| GPT-4o | | 2026.03 | 48.8 |
| Llama-3.1-70B-Inst. | | 2026.03 | 34.4 |
| Eurus-2-7B-PRIME-R-TAP | training=R-TAP integrated | 2026.03 | 31.8 |
| Eurus-2-7B-PRIME | | 2026.03 | 27.5 |
| RLOO | | 2026.03 | 26.7 |
| TIC-GRPO | Model=Qwen3-8B | 2025.08 | 21 |
| GSPO | Model=Qwen3-8B | 2025.08 | 20 |
| GRPO | Model=Qwen3-8B | 2025.08 | 19.88 |
| Eurus-2-7B-SFT | | 2026.03 | 17.8 |
| TIC-GRPO | Model=Qwen3-1.7B | 2025.08 | 12.11 |
| Qwen2.5-Math-7B-Inst. | | 2026.03 | 11.3 |
| GSPO | Model=Qwen3-1.7B | 2025.08 | 11.23 |
| GRPO | Model=Qwen3-1.7B | 2025.08 | 10.67 |
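
The scores above are Pass@1 percentages. As a rough illustration of how such a number is commonly computed (this is a sketch of the standard definition, not the leaderboard's own evaluation code), the hypothetical helper `pass_at_1` below takes, for each problem, a list of booleans saying whether each sampled solution passed all tests, and averages the per-problem pass fraction. The function name and input layout are assumptions made for this example.

```python
def pass_at_1(results):
    """Return Pass@1 as a percentage.

    `results` is a list with one entry per problem; each entry is a
    non-empty list of booleans, one per sampled solution (True if the
    sample passed every test). This layout is an assumption for the sketch.
    """
    if not results:
        raise ValueError("results must contain at least one problem")
    per_problem = []
    for samples in results:
        if not samples:
            raise ValueError("each problem needs at least one sample")
        # With n samples and c correct, the pass@1 estimate is c / n.
        per_problem.append(sum(samples) / len(samples))
    # Average over problems and express as a percentage.
    return 100.0 * sum(per_problem) / len(per_problem)


if __name__ == "__main__":
    # Four problems, one sample each; three of the four pass.
    print(pass_at_1([[True], [False], [True], [True]]))
```

With a single sample per problem this reduces to the percentage of problems solved on the first attempt; with several samples it averages the per-problem success rate.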