General Capability on MMLU-Pro OpenR1-Math Harder
[Chart: Accuracy; Qwen-4B, 71.3, Feb 11, 2026]
Updated 1mo ago
Evaluation Results

Method    Details                      Date      Accuracy
Qwen-4B   Backbone=Qwen, Paramet...    2026.02   71.3
RePO      Backbone=Qwen, Paramet...    2026.02   71.2
LUFFY     Backbone=Qwen, Paramet...    2026.02   70.5