Medical Diagnosis on MD DX weighted (test)
[Chart: Worst-case Weighted Payoff over time. Top result: DC, 126.4 (Feb 2, 2026). Updated 4d ago.]
Evaluation Results
| Method | Backbone LLM | Date | Worst-case Weighted Payoff |
|--------|--------------|------|----------------------------|
| DC | GPT 4.1 | 2026.02 | 126.4 |
| DP | GPT 4.1 | 2026.02 | 116 |
| DP | Qwen 2.5... | 2026.02 | 114.3 |
| UoT | GPT 4.1 | 2026.02 | 110 |
| UoT | Qwen 2.5... | 2026.02 | 99 |
| DC | Qwen 2.5... | 2026.02 | 86.1 |
| GoT | GPT 4.1 | 2026.02 | 78.3 |
| GoT | Qwen 2.5... | 2026.02 | 73.6 |