LegalMidm: Use-Case-Driven Legal Domain Specialization for Korean Large Language Model

About

In recent years, the rapid proliferation of open-source large language models (LLMs) has spurred efforts to turn general-purpose models into domain specialists. However, many domain-specialized LLMs are developed using datasets and training protocols that are not aligned with the nuanced requirements of real-world applications. In the legal domain, where precision and reliability are essential, this lack of consideration limits practical utility. In this study, we propose a systematic training framework grounded in the practical needs of the legal domain, with a focus on Korean law. We introduce LegalMidm, a Korean legal-domain LLM, and present a methodology for constructing high-quality, use-case-driven legal datasets and optimized training pipelines. Our approach emphasizes collaboration with legal professionals and rigorous data curation to ensure relevance and factual accuracy, and demonstrates effectiveness in key legal tasks.

Youngjoon Jang, Chanhee Park, Hyeonseok Moon, Young-kyoung Ham, Jiwon Moon, Jinhyeon Kim, JuKyung Jung, Heuiseok Lim • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| General Knowledge Evaluation | HAERAE | Accuracy: 70.3 | 13 |
| Legal Machine Reading Comprehension | Legal Task - MRC (test) | ROUGE-L: 57.5 | 5 |
| Legal Multiple Choice Question Answering | Legal Task MC (test) | Accuracy: 65 | 5 |
| Legal Question Answering | Legal Task QA (test) | ROUGE-L: 17.74 | 5 |
| Legal Summarization | Legal Task Summary (test) | ROUGE-L: 47.94 | 5 |
| Legal text generation | Legal Task Complaint (test) | ROUGE-L: 67.67 | 5 |
| Legal text generation | Legal Task Petition (test) | ROUGE-L: 14.46 | 5 |
| General Knowledge Evaluation | KMMLU | Accuracy: 44.75 | 5 |