Dialogue Reasoning on MUTUAL
Top AIBC Score: 0.467 (AutoMix + T, Oct 19, 2023)
Updated 4d ago
Evaluation Results
| Method | Configuration | Date | AIBC Score |
|---|---|---|---|
| AutoMix + T | SLM=MISTRAL-7B, Router... | 2023.10 | 0.467 |
| AutoMix + P | SLM=MISTRAL-7B, Router... | 2023.10 | 0.467 |
| FrugalGPT | SLM=MISTRAL-7B | 2023.10 | 0.203 |
| AutoMix + P | SLM=GPT-3.5, Router=PO... | 2023.10 | 0.188 |
| AutoMix + T | SLM=GPT-3.5, Router=Th... | 2023.10 | 0.183 |
| AutoMix + P | SLM=LLAMA2-13B, Router... | 2023.10 | 0.124 |
| AutoMix + T | SLM=LLAMA2-13B, Router... | 2023.10 | 0.118 |
| FrugalGPT | SLM=GPT-3.5 | 2023.10 | 0.111 |
| HybridLLM | SLM=MISTRAL-7B | 2023.10 | 0.034 |
| HybridLLM | SLM=LLAMA2-13B | 2023.10 | 0.022 |
| HybridLLM | SLM=GPT-3.5 | 2023.10 | 0.018 |
| FrugalGPT | SLM=LLAMA2-13B | 2023.10 | -0.087 |