| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Logical Reasoning | LSAT HELM | Balanced Accuracy | 24.38 | 17 |
| Logical Reasoning Question Answering | LSAT | Pass@1 | 0.29 | 11 |
| Item Response Theory Assessment | LSAT | AUC | 70.7 | 9 |
| Logical Reasoning and Reading Comprehension | LSAT PT 150–159 | LR Accuracy | 99.1 | 8 |
| Logical Reasoning and Reading Comprehension | LSAT Official (test N=77) | LR Accuracy | 100 | 8 |
| Human Proficiency Exam | LSAT | Accuracy | 81.1 | 7 |
| Question Answering | LSAT (OOD) | Accuracy | 26.58 | 5 |
| Logical Reasoning | LSAT | Accuracy | 37.4 | 5 |
| Query Routing | LSAT In-Distribution (test) | CPT (90%) | 80.14 | 4 |
| Query Routing | LSAT OOD | CPT 85% | 70.54 | 4 |
| Query Routing | LSAT | CPT (95%) | 90.01 | 4 |
| Query Routing | LSAT | CPT (90%) | 80.07 | 4 |
| Model Routing | LSAT (ID) | CPT (80%) | 60.9 | 4 |
| Model Routing | LSAT ID queries | CPT (85%) | 70.35 | 4 |
| Model Routing | LSAT | CPT (95%) | 90 | 4 |
| Query Routing | LSAT | Hypervolume | 91.75 | 4 |
| Query Routing | LSAT OOD | CPT Score (80%) | 61 | 4 |
| Legal Reasoning | LSAT (test) | Hypervolume | 0.9188 | 4 |
| Law School Admission Testing | LSAT | Score | 163 | 3 |
| Preference Learning | LSAT (test) | AUC | 0.707 | 2 |