Mathematical Reasoning on MMLU Math&Logic (train)
[Chart: R-PSR by method. Highlighted point: SFT, 39.2 (Sep 29, 2025). Selectable metrics: R-PSR, T-PSR, MTL, Accuracy.]
Evaluation Results
| Method | Configuration | Date | R-PSR | T-PSR | MTL | Accuracy |
|---|---|---|---|---|---|---|
| SFT | Backbone=R1-Llama-8B,... | 2025.09 | 39.2 | 31.1 | 1,381.7 | 78.7 |
| R1-Llama-8B (Base) | Backbone=R1-Llama-8B | 2025.09 | 37.8 | 38.1 | 1,537.9 | 72.5 |
| RL (GRPO) | Backbone=R1-Llama-8B,... | 2025.09 | 25.9 | 26.2 | 1,854 | 86.9 |
| FARL | Backbone=R1-Llama-8B,... | 2025.09 | 19.7 | 23.4 | 1,914 | 89.1 |