Competition-level Mathematics and Science Reasoning on OlympiadBench (Accuracy)
Best reported accuracy: 22.53 (Repeated Sampling), Oct 4, 2025
Evaluation Results
Method              Model                  Date      Accuracy
Repeated Sampling   Qwen2.5-3B-Instr...    2025.10   22.53
GUIDEDSAMPLING      Qwen2.5-3B-Instr...    2025.10   21.87
GUIDEDSAMPLING      Llama-3.2-3B-Ins...    2025.10   18.35
Repeated Sampling   Llama-3.2-3B-Ins...    2025.10   17.47
Tree-of-thought     Qwen2.5-3B-Instr...    2025.10   15.27
Tree-of-thought     Llama-3.2-3B-Ins...    2025.10   12.75