Multiple-choice Reasoning on Arc-bo
[Chart: Accuracy by date (Jul 12, 2025). Best result: Ours-SFT, 48.39.]
Updated 20d ago
Evaluation Results
Method               Details               Date       Accuracy
Ours-SFT             alignment=SFT         2025.07    48.39
Ours-Base-32k        context-length=32k    2025.07    44.62
Ours-Base            -                     2025.07    44.32
Tibetan-Alpaca-7B    -                     2025.07    27.94
Tibetan-Llama2-7B    -                     2025.07    27.54
Yak-Llama2-7B        -                     2025.07    25.64