
Out-of-domain Generalization on Diplomat, Mutual, Quality, CoQA, and Qasper (test)

Score: 70.9

AutoMix

Updated 4mo ago

Evaluation Results

Method	Links	Score
AutoMix 2023.10		70.9
AutoMix 2023.10		31.5
AutoMix 2023.10		28.3
FrugalGPT 2023.10		14.3
FrugalGPT 2023.10		12.5
HybridLLM 2023.10		7.6
HybridLLM 2023.10		2.4
FrugalGPT 2023.10		0
HybridLLM 2023.10		-2.8