| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Medical Question Answering | MedMCQA | Accuracy | 89.02 | 253 |
| Medical Question Answering | MedMCQA (test) | Accuracy | 84.13 | 134 |
| Question Answering | MedMCQA (test) | Test Error Rate | 0.163 | 48 |
| Medical Knowledge Editing | MedMCQA edit | Efficacy | 51 | 18 |
| Machine Unlearning | MedMCQA QF=1000 | Forget Accuracy | 90 | 14 |
| LLM Routing | MEDMCQA (val) | Top-1 Acc | 96.3 | 14 |
| LLM Routing | MedMCQA | Top-1 Acc | 96.3 | 14 |
| Clinical Question Answering | MedMCQA | Accuracy | 86.1 | 14 |
| Question Answering | MedMCQA (dev) | Accuracy | 0.791 | 11 |
| Biomedical Question Answering | MedMCQA In-Domain (test) | Accuracy | 90 | 10 |
| Question Answering | MedMCQA | FDR (%) | 6.43 | 9 |
| Medical Question Answering | MedMCQA translated (test) | Accuracy (ZH) | 43.2 | 9 |
| Question Answering | MedMCQA (val) | Accuracy | 90 | 7 |
| Multiple-choice Question Answering | MedMCQA | Accuracy | 0.5785 | 6 |
| Medical Question Answering | MedMCQA subset | AUROC | 0.6448 | 6 |
| Uncertainty Quantification | MedMCQA (test) | AUROC | 71.7 | 6 |
| Multiple-choice Question Answering | MedMCQA (test) | Accuracy | 73.7 | 6 |
| Spoken Medical Question Answering | MedMCQA (test) | QA Accuracy | 92.5 | 5 |
| Medical Question Answering | MedMCQA (val) | Accuracy | 49.64 | 4 |
| ASR Error Correction | MedMCQA (test) | WER | 28.1 | 3 |
| Answer Correctness Prediction | MedMCQA (val) | Head Entropy | 0.79 | 1 |