Reasoning on MDM (test)

Accuracy: 58.3

TCR-gold

Updated 5mo ago

Evaluation Results

Method	Links	Accuracy
TCR-gold 2026.01		58.3
TCR 2026.01		48.2
TCR-gold 2026.01		47.1
Qwen2.5-7B-Instruct 2026.01		43
DoLa 2026.01		38.5
TCR 2026.01		29.3
Qwen3-8B-Instruct 2026.01		24.7
TCR-gold 2026.01		15.8
TCR 2026.01		11.5
DoLa 2026.01		9
Phi-3-Instruct 2026.01		8.1
DoLa 2026.01		6.7
LLaMA3-8B-Instruct 2026.01		0
DoLa 2026.01		0
TCR 2026.01		0
TCR-gold 2026.01		0