Code Reasoning on LiveCodeBench (Pass@1)
Top result: REBALANCE, 88.3 Pass@1 Accuracy
[Figure: Pass@1 Accuracy and Token Count over time, Nov 5, 2025 to Mar 12, 2026]
Updated 9d ago
Evaluation Results
Method            Details                    Date     Pass@1 Accuracy  Token Count
REBALANCE         Backbone=QwQ-32B           2026.03  88.3             5,000
Baseline          Backbone=QwQ-32B           2026.03  87.5             6,000
REBALANCE         Backbone=Qwen3-14B         2026.03  84.6             6,000
Baseline          Backbone=Qwen3-14B         2026.03  83.5             7,000
SnapStream        Maximum sequence lengt...  2025.11  78.08            -
DeepSeek-R1-0528  Maximum sequence lengt...  2025.11  73.97            -
DeepSeek-R1-0528  Maximum sequence lengt...  2025.11  71.43            -
REBALANCE         Backbone=DeepSeek-R1-D...  2026.03  46.5             8,000
Baseline          Backbone=DeepSeek-R1-D...  2026.03  44               9,000
REBALANCE         Backbone=DeepSeek-R1-D...  2026.03  22.5             11,000
Baseline          Backbone=DeepSeek-R1-D...  2026.03  19.5             12,000