Multi-task Generalization on Multi-task Overall Average

40.5 Accuracy

Single Best

Updated 1mo ago

Evaluation Results

Method	Date	Links	Accuracy
Single Best	2026.05		40.5
EvoGM	2026.05		38
CMA	2026.05		37.5
PSO-Merging	2026.05		37.2
Model Swarm	2026.05		37.2
Task Arithmetic	2026.05		36.4
DARE	2026.05		36.4
TIES	2026.05		35.6
Base	2026.05		35.2
Model Soup	2026.05		35
MTL	2026.05		33