
CaseHOLD

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Legal Reasoning | CaseHOLD (test) | Test Accuracy: 89.22 | 22 |
| Legal Reasoning | CaseHold | Accuracy (CaseHold): 83.13 | 16 |
| Case holding classification | CaseHOLD (test) | Mean macro F1: 78.5 | 12 |
| Question Answering | CaseHOLD | AR (%): 100 | 9 |
| Legal Reasoning | CaseHold | Cumulative Score (CS): 96 | 8 |
| Question Answering | CaseHOLD (eval) | Risk: 7.8 | 3 |