
Multi-task Language Understanding on MMLU (Adversarial Robustness)

[Chart: Accuracy under Attack (scale to 100) by date; labeled method: Reporting-and-penalty mechanism; date shown: Apr 26, 2026]
Updated 1mo ago

Evaluation Results

Method | Links
2026.04
100100
2026.04
7693
2026.04
5292
2026.04
4497
2026.04
4384