Law

Benchmarks

Task Name	Dataset Name	SOTA Result
Law reasoning	Law	Accuracy 70.25	27
Speculative Decoding	Law	Throughput (tokens/s) 132.69	22
Counterfactual Explanations	Law	Validity 100	18
Legal Reasoning	Law	LLM-as-judge Score 34.4	13
Legal Reasoning	Law	Score 26.52	13
Machine Translation	Law (test)	BLEU 61.55	9
Multiple Choice Question Answering	Law	Accuracy 45.4	8
Multi-class Classification	Law	Accuracy 69.1	8
Machine Translation	Law (Ko-En) (test)	BLEU 53.8	8
Machine Translation	Law De-En (test)	BLEU 75.44	8
Machine Translation	Law All-domain datastore (test)	BLEU 61.22	6
Legal Reasoning	Law (test)	Score 45.29	5
Machine Translation	Law En-De out-of-domain WMT14 (test)	BLEU Score 41.5	5
Symmetric Robustness Verification	Law	Verification Time (s) 1.97	4
Asymmetric Robustness Verification	Law	Verification Time (s) 1.34	4
Within-distribution detection	Law Unsafe vs. Safe v3/v4 (holdout)	AUROC 0.971	2
Machine Translation	Law multi-domain (test)	Decoding Speed (Tok/Sec) 3,690.85	2
