Multilingual Question Answering on mGPQA
[Figure: Accuracy over time. Highlighted result: self-cons (ours), 32.9 accuracy, May 31, 2026]
Updated 1d ago
Evaluation Results
Method             Model Size   Date      Accuracy
self-cons (ours)   14B          2026.05   32.9
S1                 14B          2026.05   32.1
LIDR               14B          2026.05   29.4
self-cons (ours)   7B           2026.05   28.9
base               14B          2026.05   28.2
S1                 7B           2026.05   25.7
self-cons (ours)   1.5B         2026.05   24.9
base               7B           2026.05   24.6
S1                 1.5B         2026.05   24.5
LIDR               7B           2026.05   23.5
base               1.5B         2026.05   22.8
LIDR               1.5B         2026.05   18.8