Medical Visual Question Answering and Reasoning on CheXbench
[Chart: Rad-Restruct Acc by submission date. Highlighted point: RadAgents, 76.5, Sep 24, 2025. Selectable metrics: Rad-Restruct Acc, SLAKE Acc, OpenI Reasoning Acc, Overall Acc.]
Updated 3d ago
Evaluation Results
Method                Model / Backbone  Date     Rad-Restruct Acc  SLAKE Acc  OpenI Reasoning Acc  Overall Acc
RadAgents             RadAg...          2025.09  76.5              89.4       69.2                 74.6
Qwen3-VL w/ Workflow  Qwen3...          2025.09  72.2              87.8       65.3                 71
RadAgents wo/ V-RAG   RadAg...          2025.09  71.3              87.8       66.3                 71.5
Qwen3-VL w/ ReAct     Qwen3...          2025.09  70.4              86.2       61.3                 68
CheXagent             CheXa...          2025.09  57.1              78.1       59                   64.7
GPT-4o                GPT-4o            2025.09  53.9              85.4       51.1                 63.5