General Language Understanding on General Downstream Tasks Aggregate

59.5 Average Accuracy

PonderLM-2-Pythia-1.4B

Updated 26d ago

Evaluation Results

Method	Links	Average Accuracy
PonderLM-2-Pythia-1.4B 2025.09		59.5	5.4
PonderLM-2-Pythia-1.4B 2025.09		58.5	4.4
Aurora 2026.06		57.2	-
Ponder-1.4B 2025.09		56.5	-
Muon 2026.06		56	-
U-NorMuon 2026.06		55.5	-
NorMuon 2026.06		55.1	-
Pythia-1.4B 2025.09		54.1	-
PonderLM-2-Pythia-410M 2025.09		51.9	4.3
Ponder-410M 2025.09		50.4	-
Pythia-410M 2025.09		47.6	-
Aurora 2026.06		46.4	-
U-NorMuon 2026.06		45.8	-
NorMuon 2026.06		45.7	-
Muon 2026.06		45.6	-