| Dataset Name | SOTA Method | Metric | Value | Trend | |
|---|---|---|---|---|---|
| agent-CMB | Medical-CoT* | Rounds | 18.34 | 25 | 4d ago |
| MedQA agent | MedKGI | Rounds | 9.11 | 25 | 4d ago |
| MedEinst Robust 1.0 | ECR-Agent (Qwen3-32B) | Robust Accuracy | 24.21 | 18 | 4d ago |
| MedEinst Baseline 1.0 | ECR-Agent (Qwen3-32B) | Baseline Accuracy | 69.49 | 18 | 4d ago |
| COVID19-CT | SH-PEFT | F1 Score | 83 | 16 | 4d ago |
| MAU (test) | UMed-LVLM | DL Score | 53 | 13 | 4d ago |
| NEJM | DDO | Rounds | 17.91 | 9 | 4d ago |
| MD DX | GoT | Worst Case Interaction Length | 10.5 | 8 | 4d ago |
| MD DX weighted (test) | | Worst-case Weighted Payoff | 126.4 | 8 | 4d ago |
| DiagnosisArena | | Pass@1 | 45.57 | 4 | 4d ago |