
Reasoning on Winogrande 5-shot

0.749 Normalized Log Accuracy

HATified-SFT

[Chart: y-axis ticks 0.59612, 0.63581, 0.6755, 0.71519; x-axis label Mar 16, 2026]
Updated 1mo ago

Evaluation Results

Method  Links
2026.03  0.749  -
2026.03  0.697  -
2026.03  0.687  5.16
2026.03  0.677  5.16
2026.03  0.676  5.16
2026.03  0.673  4.91
2026.03  0.635  4.91
2026.03  0.602  4.91