VLUE: A New Benchmark and Multi-task Knowledge Transfer Learning for Vietnamese Natural Language Understanding

About

The success of Natural Language Understanding (NLU) benchmarks in various languages, such as GLUE for English, CLUE for Chinese, KLUE for Korean, and IndoNLU for Indonesian, has facilitated the evaluation of new NLU models across a wide range of tasks. To establish a standardized set of benchmarks for Vietnamese NLU, we introduce the first Vietnamese Language Understanding Evaluation (VLUE) benchmark. The VLUE benchmark encompasses five datasets covering different NLU tasks, including text classification, span extraction, and natural language understanding. To provide an insightful overview of the current state of Vietnamese NLU, we then evaluate seven state-of-the-art pre-trained models, including both multilingual and Vietnamese monolingual models, on our proposed VLUE benchmark. Furthermore, we present CafeBERT, a new state-of-the-art pre-trained model that achieves superior results across all tasks in the VLUE benchmark. Our model combines the proficiency of a multilingual pre-trained model with Vietnamese linguistic knowledge. CafeBERT is developed based on the XLM-RoBERTa model, with an additional pretraining step utilizing a significant amount of Vietnamese textual data to enhance its adaptation to the Vietnamese language. CafeBERT is made publicly available for future research.

Phong Nguyen-Thuan Do, Son Quoc Tran, Phu Gia Hoang, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen • 2024

Related benchmarks

Task	Dataset	Result
Multiple-choice reading comprehension	ViMMRC 2.0	Accuracy 57.98	29
Natural Language Inference	ViNLI	Accuracy 86.11	17
Information Retrieval	ViWikiFC	Top-1 Accuracy 72.6	12
Sentiment Classification	UIT-VSFC (test)	Accuracy 94.16	9
Machine Reading Comprehension	UIT-ViQuAD 2.0	EM 65.25	9
Topic Classification	UIT-VSFC (test)	Accuracy 89.07	9
Emotion Recognition	VSMEC	F1 Score 66.12	8
Hate Speech Detection	ViHOS	F1 Score 78.56	8
Part-of-Speech Tagging	NIIVTB POS	F1 Score 84.04	8
Constructive Speech Detection	UIT-ViCTSD	Accuracy 83	8

