
Multilingual Multiple-Choice Reasoning on INCLUDE 44 languages 1.0 (test)

Average Accuracy: 56.9 (Qwen3.5-4B)

[Chart: Average Accuracy over time; y-axis ticks 38.7, 43.425, 48.15, 52.875; data point dated Mar 12, 2026]
Updated 1mo ago

Evaluation Results

Date      Average Accuracy
2026.03   56.9
2026.03   52.6
2026.03   52.2
2026.03   48.9
2026.03   45.1
2026.03   39.4