Improving Medical VQA through Trajectory-Aware Process Supervision

About

Reasoning capabilities are crucial for reliable medical visual question answering (VQA); however, existing datasets rarely include reasoning explanations. We address this by generating reasoning trajectories for six medical VQA benchmarks using the COMCTS algorithm with open-source vision-language models, with an LLM serving as the verification judge. Building on these generated datasets, we propose a two-stage training framework: supervised fine-tuning followed by Group Relative Policy Optimization (GRPO) with a novel process-based reward. While standard approaches rely solely on exact-match rewards for final answers, we introduce a trajectory-aware reward that measures the similarity between generated and ground-truth reasoning processes. Specifically, we embed reasoning steps using sentence transformers and compute the Dynamic Time Warping (DTW) distance between the resulting vector sequences. Experiments across six benchmarks demonstrate that combining the DTW-based process reward with exact-match reward consistently outperforms SFT-only training, raising mean accuracy from 0.598 to 0.689, mean BERTScore from 0.845 to 0.881, and mean ROUGE-L from 0.665 to 0.748. Our results highlight the importance of process supervision in training reasoning-capable medical VLMs. We make our code and generated reasoning datasets publicly available at https://anonymous.4open.science/r/MICCAI-R1-MED-VQA-code-B14B/
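The trajectory-aware reward described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes each reasoning step has already been embedded into a vector (for example by a sentence-transformer encoder, which is not called here). The cosine local cost, the length normalization, the `1 / (1 + d)` mapping from distance to reward, and the `weight` parameter are all assumptions chosen for the sketch, and the function names are hypothetical.

```python
import numpy as np


def cosine_distance_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine distance between rows of a (n x d) and b (m x d)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a @ b.T


def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Classic O(n*m) Dynamic Time Warping over two sequences of step embeddings."""
    cost = cosine_distance_matrix(a, b)
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Extend the cheapest of the three admissible warping moves.
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return float(acc[n, m])


def combined_reward(pred_steps, ref_steps, pred_answer, ref_answer, weight=0.5):
    """Exact-match reward plus a weighted process reward (assumed combination)."""
    exact = 1.0 if pred_answer.strip().lower() == ref_answer.strip().lower() else 0.0
    # Assumption: normalize by total sequence length, then map distance to (0, 1].
    d = dtw_distance(pred_steps, ref_steps) / (len(pred_steps) + len(ref_steps))
    process = 1.0 / (1.0 + d)
    return exact + weight * process
```

Because DTW aligns sequences of different lengths, a generated trajectory that repeats or splits a reference step is penalized less than one whose steps appear in a different order, which is the property that makes it a plausible fit for comparing reasoning chains.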

Halil Ibrahim Gulluk, Olivier Gevaert • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Medical Visual Question Answering | VQA-RAD | Accuracy | 67.6 | 228 |
| Medical Visual Question Answering | PMC-VQA | Accuracy | 64.1 | 103 |
| Medical Visual Question Answering | PathVQA | Accuracy | 71 | 80 |
| Medical Visual Question Answering | OmniMedVQA | Accuracy | 55.7 | 48 |
| Medical Visual Question Answering | VQA-Med | Accuracy | 73.3 | 9 |
| Medical Visual Question Answering | VQA-RAD | BLEU-1 | 0.695 | 7 |
| Medical Visual Question Answering | Path-VQA | Sentence BLEU-1 | 71.3 | 4 |
| Medical Visual Question Answering | PMC-VQA | Sentence BLEU-1 | 0.738 | 4 |
| Medical Visual Question Answering | Slake-VQA | Sentence BLEU-1 | 0.841 | 4 |
| Medical Visual Question Answering | VQA-Med | Sentence BLEU-1 | 75.8 | 4 |
Showing 10 of 38 rows
