Math Problem Solving on Math Benchmarks LIMO curation (test)
[Figure: Accuracy over time. Highlighted point: LALP, 72.6 accuracy, Oct 5, 2025.]
Evaluation Results
Method          Setting                     Date      Accuracy
LALP            Student Model=Qwen2.5-...   2025.10   72.6
Random          Student Model=Qwen2.5-...   2025.10   65.1
GALP            Student Model=Qwen2.5-...   2025.10   63.2
Local Lowest    Student Model=Qwen2.5-...   2025.10   62.3
Original Model  Student Model=Qwen2.5-...   2025.10   44.5
LALP            Student Model=Qwen2.5-...   2025.10   44
GALP            Student Model=Qwen2.5-...   2025.10   41.2
Random          Student Model=Qwen2.5-...   2025.10   40.7
Local Lowest    Student Model=Qwen2.5-...   2025.10   39.9
Original Model  Student Model=Qwen2.5-...   2025.10   35.3