Dialogue-based Mathematical Problem Solving on MathChat
[Chart: R1 by date. Data point: Qwen2-Math-7B-ScaleQuest, 89.7, Oct 24, 2024. Selectable metrics: R1, R2, R3, Error Correction Rate, Average Score.]
Updated 1mo ago
Evaluation Results
| Method | Details | Date | R1 | R2 | R3 | Error Correction Rate | Average Score |
|---|---|---|---|---|---|---|---|
| Qwen2-Math-7B-ScaleQuest | Backbone=Qwen2-Math-7B... | 2024.10 | 89.7 | 61.7 | 53.5 | 91.1 | 72.5 |
| Qwen2-Math-7B-Instruct | Backbone=Qwen2-Math-7B... | 2024.10 | 89.5 | 62.4 | 53.5 | 89.9 | 72.7 |