DDXPlus

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Medical Question Answering | DDXPlus | Accuracy | 86.5 | 43 |
| Medical Reasoning | DDXPlus | Accuracy (DDXPlus) | 83.37 | 36 |
| Open-set diagnostic naming | DDXPlus | Accuracy | 48.5 | 15 |
| Medical Diagnosis | DDxPlus | Similarity | 96.6 | 12 |
| Medical Diagnosis | DDXPlus n=50 | Top-1 Accuracy | 78 | 12 |
| Medical Reasoning | DDXPlus | Token Cost | 2,383 | 11 |
| Medical Reasoning | DDXPlus | Performance Score | 90.2 | 11 |
| Automated Medical Diagnosis | DDXPlus (test) | IL | 25.75 | 9 |
| Medical Diagnosis | DDXPlus original (test) | Similarity Score | 0.953 | 8 |
| Medical Differential Diagnoses | DDXPlus | Avg Correct | 79 | 8 |
| Privacy Rewriting | DDXPlus Pri | Accuracy | 87.6 | 7 |
| Confidence Estimation | DDXPlus | AUROC | 0.795 | 7 |
| Calibration | DDXPlus | Top-1 ECE | 0.01 | 4 |
| Classification | DDXPlus | Accuracy | 50.1 | 4 |
| Synthetic Data Utility | DDXPlus | Overall Score | 99.9 | 3 |
| Privacy Evaluation | DDXPlus | Overall Score | 100 | 3 |
| Synthetic Data Detection | DDXPlus | Overall Score | 37.7 | 3 |
| Tabular Data Synthesis | DDXPlus | Overall Score | 97.2 | 3 |