Selective Question Answering on AbstainQA (test)

Top result: EvoGM, Accuracy 13

Updated 1mo ago

Evaluation Results

Method	Links	Accuracy
EvoGM 2026.05		13
Single Best 2026.05		11.9
Base 2026.05		10.1
CMA 2026.05		7.8
Model Swarm 2026.05		7.1
Task Arithmetic 2026.05		6
PSO-Merging 2026.05		5.9
Model Soup 2026.05		4.8
DARE 2026.05		4.3
TIES 2026.05		2.8
MTL 2026.05		0.3