SaulLM-7B: A pioneering Large Language Model for Law

About

In this paper, we introduce SaulLM-7B, a large language model (LLM) tailored for the legal domain. With 7 billion parameters, SaulLM-7B is the first LLM designed explicitly for legal text comprehension and generation. Leveraging the Mistral 7B architecture as its foundation, SaulLM-7B is trained on an English legal corpus of over 30 billion tokens. SaulLM-7B exhibits state-of-the-art proficiency in understanding and processing legal documents. Additionally, we present a novel instructional fine-tuning method that leverages legal datasets to further enhance SaulLM-7B's performance in legal tasks. SaulLM-7B is released under the MIT License.

Pierre Colombo, Telmo Pessoa Pires, Malik Boudiaf, Dominic Culver, Rui Melo, Caio Corro, Andre F. T. Martins, Fabrizio Esposito, Vera Lúcia Raposo, Sofia Morgado, Michael Desa • 2024

Related benchmarks

Task | Dataset | Result | Rank
Code Generation | HumanEval | -- | 1043
Reasoning | MMLU-Pro | Accuracy 27.57 | 241
Question Answering | MedMCQA | Accuracy 41.5 | 98
Reasoning | GPQA | Accuracy 30.3 | 88
Medical Reasoning | MedMCQA | Accuracy 41.5 | 58
Reasoning | MMLU | Accuracy 55.86 | 54
Language Understanding | MMLU stratified sampling 50 samples per category | Accuracy 55.86 | 14
Language Understanding | MMLU-Pro stratified sampling: 150 samples per category | Accuracy 27.57 | 14
Legal Inquisitive Dialogue | U.S. Supreme Court Oral Argument dataset | CS Score 4.01 | 7
Dialogue Generation | Judicial Dialogue Human Evaluation (test) | CS 3.73 | 6
Showing 10 of 12 rows
