LLM-XTM: Enhancing Cross-Lingual Topic Models with Large Language Models

About

Cross-lingual topic modeling aims to discover shared semantic structures across languages, yet existing models depend on sparse bilingual resources and often yield incoherent or weakly aligned topics. Recent LLM-based refinements improve interpretability but are costly, document-level, and prone to hallucination, with prior white-box approaches requiring inaccessible token probabilities. We propose LLM-XTM, a framework that integrates LLM-guided topic refinement with self-consistency uncertainty quantification, enabling black-box, stable, and scalable enhancement of cross-lingual topic models. Experiments on multilingual corpora show that LLM-XTM achieves superior topic coherence and alignment while reducing reliance on bilingual dictionaries and expensive LLM calls.
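
The abstract does not spell out how self-consistency uncertainty quantification works, so the following is only a generic sketch of the idea under stated assumptions, not the authors' procedure. It assumes a black-box LLM is asked several times to refine the same topic, and that a word's confidence is the fraction of sampled refinements containing it. The function names, the threshold, and the sample word lists are all hypothetical; the lists stand in for real LLM outputs.

```python
from collections import Counter


def word_agreement(samples):
    """Return, for each word, the fraction of sampled refinements containing it."""
    counts = Counter()
    for words in samples:
        counts.update(set(words))  # count each word at most once per sample
    n = len(samples)
    return {word: c / n for word, c in counts.items()}


def refine_topic(samples, threshold=0.5):
    """Keep words whose agreement meets the threshold, most consistent first."""
    scores = word_agreement(samples)
    kept = [w for w, s in scores.items() if s >= threshold]
    # Sort by descending agreement, breaking ties alphabetically.
    return sorted(kept, key=lambda w: (-scores[w], w))


# Hypothetical stand-ins for three independent LLM refinements of one topic.
samples = [
    ["battery", "charger", "phone", "screen"],
    ["battery", "phone", "screen", "banana"],
    ["battery", "charger", "phone", "cable"],
]
scores = word_agreement(samples)
refined = refine_topic(samples, threshold=0.6)
```

Because this kind of scoring needs only the sampled outputs, it requires no token probabilities, which is what makes a black-box setting possible. Words that appear in a single sample (here "banana" and "cable") fall below the threshold and are dropped as likely hallucinations.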

Minh Chu Xuan, Tien-Phat Nguyen, Linh Ngo Van, Dinh Viet Sang, Nguyen Thi Ngoc Diep, Trung Le • 2026

Related benchmarks

Task                          Dataset            Result                   Rank
Topic Modeling                EC News            CNPMI (Coherence) 0.088  18
Document Classification       Amazon Review EN   Accuracy 80.03           16
Cross-lingual Topic Modeling  Amazon Review      CNPMI 0.072              10
Cross-lingual Topic Modeling  Rakuten Amazon     CNPMI 0.04               10
Document Classification       EC News EN         Accuracy 79.75           8
Document Classification       EC News ZH         Accuracy 77.85           8
Document Classification       Amazon Review ZH   Accuracy 73.21           8
Document Classification       Rakuten Amazon JA  Accuracy 83.34           8
Topic Modeling                Airiti Thesis      CNPMI 0.0531             8
