Multilingual Logical Reasoning on CrossMMLU
[Figure: leaderboard chart of Accuracy (English). Plotted point: Qwen2.5-7B-AdaMCOT, 84.7, Jan 27, 2025. Selectable metrics: Accuracy (English), Accuracy (Chinese), Accuracy (Indonesian), Accuracy (Consensus).]
Evaluation Results
Method                   Backbone     Date     Accuracy (English)  Accuracy (Chinese)  Accuracy (Indonesian)  Accuracy (Consensus)
Qwen2.5-7B-AdaMCOT       Qwen2.5-7B   2025.01  84.7                82.7                77.3                   91.3
LLaMA3.1-8B-AdaMCOT      LLaMA3.1-8B  2025.01  84                  74                  69.3                   82.7
Qwen2.5-7B-Instruction   Qwen2.5-7B   2025.01  84                  80.7                74.7                   73.3
LLaMA3.1-8B-Instruction  LLaMA3.1-8B  2025.01  82                  68                  67.3                   66.7
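To make the comparison between rows easier to read, the following is a minimal Python sketch that subtracts each Instruction row from the AdaMCOT row with the same backbone. The scores are copied from the results above; the names `LANGS`, `SCORES`, and `gains` are illustrative, and pairing rows by backbone name is an assumption of this sketch, not something the page states. The page does not define how Accuracy (Consensus) is computed, so the sketch treats it as just another column.

```python
# Scores copied from the results above, in the order:
# (English, Chinese, Indonesian, Consensus).
LANGS = ("English", "Chinese", "Indonesian", "Consensus")
SCORES = {
    "Qwen2.5-7B-AdaMCOT": (84.7, 82.7, 77.3, 91.3),
    "LLaMA3.1-8B-AdaMCOT": (84.0, 74.0, 69.3, 82.7),
    "Qwen2.5-7B-Instruction": (84.0, 80.7, 74.7, 73.3),
    "LLaMA3.1-8B-Instruction": (82.0, 68.0, 67.3, 66.7),
}


def gains(backbone):
    """Return AdaMCOT minus Instruction, in accuracy points, per column.

    Assumption: rows are paired purely by the shared backbone name.
    """
    tuned = SCORES[f"{backbone}-AdaMCOT"]
    base = SCORES[f"{backbone}-Instruction"]
    # Round to one decimal to hide floating-point noise in the subtraction.
    return {lang: round(t - b, 1) for lang, t, b in zip(LANGS, tuned, base)}


for backbone in ("Qwen2.5-7B", "LLaMA3.1-8B"):
    print(backbone, gains(backbone))
```

Under this pairing, the largest differences on both backbones fall in the Consensus column, while the English column changes least on Qwen2.5-7B.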