Natural Language Reasoning on BoolQ, ARC-e, ARC-c, WinoGrande, HellaSwag

BoolQ Accuracy: 75.2

MoEITS

[Chart: BoolQ accuracy by date; x-axis label: Apr 12, 2026]
Updated 5d ago

Evaluation Results

Date      BoolQ   ARC-e   ARC-c   WinoGrande   HellaSwag   Avg
2026.04   75.2    72.3    36.01   67.46        61.27       62.45
2026.04   75.08   71.68   41.13   66.54        53.08       61.5
2026.04   69.14   52.02   29.1    59.12        42.99       50.47
2026.04   68.56   59.76   44.97   52.57        56.4        56.45
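
The last value in each result row is consistent with the unweighted mean of the five scores before it. The sketch below checks that arithmetic in Python. It assumes the scores appear in the order given in the page title (BoolQ, ARC-e, ARC-c, WinoGrande, HellaSwag); the names `rows`, `mean_score`, and `check_rows` are illustrative and do not come from the page.

```python
# Each entry: (five benchmark scores, reported final value).
# Assumed column order: BoolQ, ARC-e, ARC-c, WinoGrande, HellaSwag.
rows = [
    ([75.2, 72.3, 36.01, 67.46, 61.27], 62.45),
    ([75.08, 71.68, 41.13, 66.54, 53.08], 61.5),
    ([69.14, 52.02, 29.1, 59.12, 42.99], 50.47),
    ([68.56, 59.76, 44.97, 52.57, 56.4], 56.45),
]


def mean_score(scores):
    """Unweighted mean of a list of benchmark scores."""
    return sum(scores) / len(scores)


def check_rows(rows, tol=0.006):
    """For each row, report whether the mean matches the reported value
    to within rounding at two decimal places."""
    return [abs(mean_score(scores) - reported) <= tol for scores, reported in rows]


if __name__ == "__main__":
    for scores, reported in rows:
        print(f"computed {mean_score(scores):.2f}, reported {reported}")
    print("all rows consistent:", all(check_rows(rows)))
```

Because the mean is unweighted, a weak score on a single benchmark (for example ARC-c) lowers the final value as much as a weak score on any other benchmark.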