Dictionary Insertion Prompting for Multilingual Reasoning on Multilingual Large Language Models

About

Current Large Language Models (LLMs) have two shortcomings. The first is a shortage of multilingual models: most LLMs are English-centric, and their performance on multilingual reasoning is limited. The second concerns where external knowledge is placed: most retrieved knowledge is prepended to the user query, which may be sub-optimal. This paper presents a novel, simple yet effective method called Dictionary Insertion Prompting (DIP). Given a non-English prompt, DIP looks up a word dictionary and inserts the words' English counterparts into the middle of the prompt. This enables better translation into English and better English reasoning steps by the model, which leads to clearly better results. We experiment with 10 to 200 languages from FLORES-200 (the number of languages varies across datasets; we experiment with 200 languages on GSM8K, as described in the Appendix). Since no adequate datasets exist, we use the NLLB translator to create synthetic multilingual benchmarks from 4 existing English reasoning benchmarks, such as GSM8K and AQuA. The synthetic benchmarks are translated back into English for quality assurance with manual annotation. Interestingly, where the dictionary is injected is an important factor in the performance gains: interleaving the dictionary with the original words performs better than prepending or appending it, given the same dictionary.
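The contrast between interleaving and prepending the dictionary can be illustrated with a minimal sketch. The abstract does not specify the exact insertion format, so the parenthesised annotation style, the function names `interleave_dictionary` and `prepend_dictionary`, and the two-entry toy Kazakh-English dictionary are all assumptions made for illustration, not the authors' implementation.

```python
import re
from typing import Dict


def interleave_dictionary(prompt: str, dictionary: Dict[str, str]) -> str:
    """Insert each known word's English counterpart right after the word.

    The "(english)" annotation format is an assumption for illustration.
    """
    def annotate(match: "re.Match[str]") -> str:
        word = match.group(0)
        english = dictionary.get(word.lower())
        return f"{word} ({english})" if english else word

    # \w+ matches Unicode word characters, so Cyrillic tokens are covered.
    return re.sub(r"\w+", annotate, prompt)


def prepend_dictionary(prompt: str, dictionary: Dict[str, str]) -> str:
    """Baseline: place the same dictionary entries before the prompt."""
    entries = [
        f"{word} = {dictionary[word.lower()]}"
        for word in re.findall(r"\w+", prompt)
        if word.lower() in dictionary
    ]
    return "; ".join(entries) + "\n" + prompt if entries else prompt


# Hypothetical toy dictionary; a real system would use a full bilingual lexicon.
toy_dictionary = {"бес": "five", "алма": "apple"}
prompt = "Менде бес алма бар"

interleaved = interleave_dictionary(prompt, toy_dictionary)
prepended = prepend_dictionary(prompt, toy_dictionary)
```

Both functions use the same dictionary and differ only in where its entries land in the prompt, which is the variable the abstract reports as affecting performance.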

Hongyuan Lu, Zixuan Li, Wai Lam • 2024

Related benchmarks

Task | Dataset | Result | Rank
Date Understanding | Date Understanding FLORES-200 10-languages | Performance (kaz_Cyrl): 72.4 | 14
Math Reasoning | SVAMP | Kazakh (Cyrl) Accuracy: 78.33 | 7
Math Word Problem Solving | SVAMP 10 low-resourced languages FLORES-200 (test) | Kazakh (Cyrillic) Accuracy: 44 | 7
Mathematical Reasoning | SVAMP 10 low-resourced languages FLORES-200 | Kazakh (Cyrl) Score: 7.33 | 7
Mathematical Reasoning | GSM8K FLORES-200 (10 low-resourced languages) (test) | Kazakh (Cyrl) Accuracy: 67.93 | 7
Sports Understanding | Sports Understanding 10 low-resourced languages FLORES-200 | Kazakh (Cyrl) Score: 58.8 | 7
Date Understanding | FLORES-200 10 low-resourced languages | Performance Score (kaz_Cyrl): 20.4 | 7