EXAONE 3.5: Series of Large Language Models for Real-world Use Cases
About
This technical report introduces the EXAONE 3.5 instruction-tuned language models, developed and released by LG AI Research. The EXAONE 3.5 language models are offered in three configurations: 32B, 7.8B, and 2.4B. These models feature several standout capabilities: 1) exceptional instruction-following performance in real-world scenarios, achieving the highest scores across seven benchmarks, 2) outstanding long-context comprehension, attaining the top performance in four benchmarks, and 3) competitive results compared to state-of-the-art open models of similar sizes across nine general benchmarks. The EXAONE 3.5 language models are open to anyone for research purposes and can be downloaded from https://huggingface.co/LGAI-EXAONE. For commercial use, please reach out to the official contact point of LG AI Research: contact_us@lgresearch.ai.
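Since the checkpoints are hosted on Hugging Face, loading one with the `transformers` library typically looks like the sketch below. The repo-id pattern, the `-Instruct` suffix, and the `trust_remote_code` flag are assumptions based on common Hugging Face conventions, not details confirmed by this report; verify the exact names on the release page.

```python
# Minimal sketch for loading an EXAONE 3.5 checkpoint from Hugging Face.
# The repo-id pattern below is an assumption inferred from the release page
# (https://huggingface.co/LGAI-EXAONE); check the exact repo names there.

EXAONE_SIZES = ("2.4B", "7.8B", "32B")  # the three released configurations


def exaone_repo_id(size: str) -> str:
    """Build the (assumed) Hugging Face repo id for a given model size."""
    if size not in EXAONE_SIZES:
        raise ValueError(f"unknown EXAONE 3.5 size: {size!r}")
    return f"LGAI-EXAONE/EXAONE-3.5-{size}-Instruct"


def load_exaone(size: str = "2.4B"):
    """Download tokenizer and weights (requires `transformers` and a torch backend)."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = exaone_repo_id(size)
    tokenizer = AutoTokenizer.from_pretrained(repo)
    # trust_remote_code=True is commonly required for custom model classes
    # (an assumption here, not stated in the report).
    model = AutoModelForCausalLM.from_pretrained(repo, trust_remote_code=True)
    return tokenizer, model
```

For commercial deployments, the licensing contact above applies regardless of how the weights are obtained.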
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Instruction Following | IFEval | Accuracy | 83.6 | 625 |
| Paraphrase Identification | PAWS-X | Accuracy | 85.24 | 66 |
| Coding | MBPP+ | Pass@1 | 79.4 | 52 |
| Mathematics | GSM8K | Score | 82.5 | 39 |
| Trustworthiness Evaluation | LLM Trustworthiness Benchmark | Bias Score | 84.5 | 17 |
| Bias Evaluation | KoBBQ | Ambiguous Context Score | 87.9 | 17 |
| Natural Language Understanding | KoBEST | BoolQ Score | 92.59 | 13 |
| Multiple-choice Question Answering | MMLU Redux (test) | Accuracy | 79.26 | 13 |
| LLM-generated Text Detection | KatFish Paper Abstract | AUC-ROC (Solar) | 70.8 | 12 |
| LLM-generated Text Detection | KatFish Essay | AUC-ROC (Solar) | 92.08 | 12 |