| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| CaseHOLD (test) | LLM2LLM | Test Accuracy | 88.14 | 17 | 4d ago |
| LexEval | Kimi-K2-thinking | Memoization | 63.78 | 15 | 4d ago |
| LegalBench Hearsay | MAD | Accuracy | 77.6 | 10 | 4d ago |
| LegalBench Learned Hands Courts | MAD | Accuracy | 75.5 | 10 | 4d ago |
| LegalBench | Llama3.1-70B | Balanced Accuracy | 79.3 | 10 | 4d ago |
| LawBench | MCE | Micro-F1 | 70 | 10 | 4d ago |
| LegalBench CUAD Cardlytics Buffalo Wild Wings PF Hospitality 2023 | Agentic Adversarial QA | Accuracy (Cardl) | 82.7 | 6 | 4d ago |