General Language Understanding on General Downstream Tasks Aggregate

Average Accuracy: 59.5

PonderLM-2-Pythia-1.4B

[Chart: Average Accuracy; axis ticks 47.124, 50.337, 53.55, 56.763; Sep 27, 2025]
Updated 1mo ago

Evaluation Results

Method    Links
2025.09   59.5   5.4
2025.09   58.5   4.4
2025.09   56.5   -
2025.09   54.1   -
2025.09   51.9   4.3
2025.09   51.9   4.3
2025.09   50.4   -
2025.09   47.6   -