Natural Language Inference on ContractNLI

84.5 Macro-F1

Full DRO

Updated 2mo ago

Evaluation Results

Method	Date	Macro-F1
Full DRO	2025.06	84.5
R3	2025.06	80.8
R3	2025.06	80.6
Avg Prob (RLPR)	2025.06	78.2
Rubric (RLER)	2025.06	75.4
Avg Logprob (VeriFree)	2025.06	73.6
RL-F1	2025.06	73.4
Base	2025.06	68.1