Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM

About

In conversational AI research, there's a noticeable trend towards developing models with a larger number of parameters, exemplified by models like ChatGPT. While these expansive models tend to generate increasingly better chat responses, they demand significant computational resources and memory. This study explores a pertinent question: Can a combination of smaller models collaboratively achieve comparable or enhanced performance relative to a singular large model? We introduce an approach termed "blending", a straightforward yet effective method of integrating multiple chat AIs. Our empirical evidence suggests that when specific smaller models are synergistically blended, they can potentially outperform or match the capabilities of much larger counterparts. For instance, integrating just three models of moderate size (6B/13B parameters) can rival or even surpass the performance metrics of a substantially larger model like ChatGPT (175B+ parameters). This hypothesis is rigorously tested using A/B testing methodologies with a large user base on the Chai research platform over a span of thirty days. The findings underscore the potential of the "blending" strategy as a viable approach for enhancing chat AI efficacy without a corresponding surge in computational demands.

Xiaoding Lu, Zongyi Liu, Adian Liusie, Vyas Raina, Vineet Mudupalli, Yuwen Zhang, William Beauchamp • 2024
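
The abstract does not spell out how the component chat AIs are combined. One simple reading, taken here as an assumption, is that each reply is produced by a single model drawn uniformly at random from the pool and conditioned on the full conversation so far, so the per-turn cost stays that of one small model. The sketch below illustrates that reading with stub models. The names `blended_reply`, `make_stub`, and `run_conversation` are hypothetical and do not come from the paper.

```python
import random
from typing import Callable, List, Sequence

# A chat model is any callable that maps the conversation history to a reply.
ChatModel = Callable[[Sequence[str]], str]


def blended_reply(models: Sequence[ChatModel],
                  history: Sequence[str],
                  rng: random.Random) -> str:
    """Pick one model uniformly at random (assumed selection rule) and let it reply."""
    model = rng.choice(models)
    return model(history)


def make_stub(name: str) -> ChatModel:
    """Hypothetical stand-in for a real 6B/13B chat model."""
    def reply(history: Sequence[str]) -> str:
        return f"[{name}] reply after {len(history)} message(s)"
    return reply


def run_conversation(models: Sequence[ChatModel],
                     user_turns: Sequence[str],
                     seed: int = 0) -> List[str]:
    """Alternate user messages with blended replies; every reply sees the whole history."""
    rng = random.Random(seed)
    history: List[str] = []
    for user_msg in user_turns:
        history.append(user_msg)
        history.append(blended_reply(models, history, rng))
    return history


if __name__ == "__main__":
    pool = [make_stub("model-a"), make_stub("model-b"), make_stub("model-c")]
    for line in run_conversation(pool, ["Hi!", "How are you?", "Tell me a story."]):
        print(line)
```

Because every reply is conditioned on the shared history, earlier replies from other models influence later ones even though only one model runs per turn.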

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Mathematical Reasoning | GSM8K | Accuracy (GSM8K): 28.4 | 358 |
| Summarization | XSum (test) | ROUGE-2: 6.9 | 231 |
| Question Answering | TriviaQA | Accuracy: 45.9 | 210 |
| Arithmetic Reasoning | GSM8K | Accuracy: 81.2 | 155 |
| Instruction Following | AlpacaEval | Win Rate: 26 | 125 |
| Question Answering | TriviaQA (test) | Accuracy: 59.3 | 121 |
| Question Answering | SQuAD (test) | -- | 111 |
| Summarization | Xsum | ROUGE-2: 11.2 | 108 |
| Question Answering | SQuAD | Exact Match: 54 | 50 |
| Data-to-text generation | WebNLG (test) | -- | 39 |
Showing 10 of 23 rows
