Agri-R1: Agricultural Reasoning for Disease Diagnosis via Automated-Synthesis and Reinforcement Learning

About

Agricultural disease diagnosis challenges VLMs, as conventional fine-tuning requires extensive labels, lacks interpretability, and generalizes poorly. While reasoning improves model robustness, existing methods rely on costly expert annotations and rarely address the open-ended, diverse nature of agricultural queries. To address these limitations, we propose \textbf{Agri-R1}, a reasoning-enhanced large model for agriculture. Our framework automates high-quality reasoning data generation via vision-language synthesis and LLM-based filtering, using only 19\% of available samples. Training employs Group Relative Policy Optimization (GRPO) with a novel reward function that integrates domain-specific lexicons and fuzzy matching to assess both correctness and linguistic flexibility in open-ended responses. Evaluated on CDDMBench, our resulting 3B-parameter model achieves performance competitive with 7B- to 13B-parameter baselines, showing a +27.9\% relative gain in disease recognition accuracy, +33.3\% in agricultural knowledge QA, and a +26.10-point improvement in cross-domain generalization over standard fine-tuning. These results suggest that automated reasoning synthesis paired with domain-aware reward design may provide a broadly applicable paradigm for RL-based VLM adaptation in data-scarce specialized domains. Our code and data are publicly available at: https://github.com/CPJ-Agricultural/Agri-R1.

Wentao Zhang, Mingkun Xu, Qi Zhang, Shangyang Li, Derek F. Wong, Lifei Wang, Yanchao Yang, Lina Lu, Tao Fang• 2026

Related benchmarks

Task	Dataset	Result
Crop Recognition	CDDMBench	Accuracy92.58	31
Knowledge QA	CDDMBench	QA Accuracy84	31
Disease Recognition	CDDMBench	Accuracy72.5	15

Showing 3 of 3 rows

Other info

Follow for update

@wizwand_team Discord