ToolFG: Towards Well-Grounded Fine-Grained Image Classification

About

Fine-grained image classification (FGIC) has broad applications and has attracted significant research attention. In this paper, we explore a novel paradigm for solving FGIC by proposing ToolFG, the first tool-integrated MLLM-based framework tailored to FGIC. ToolFG enables MLLMs to autonomously and flexibly use external tools during the reasoning process, actively interact with images, and collect verifiable visual cues for distinguishing highly similar categories in a more reliable and well-grounded manner. To equip the model with such tool-use ability, we design a novel MCTS-guided tool-use knowledge distillation mechanism, which effectively mines tool-use- and FGIC-relevant knowledge from advanced proprietary MLLMs for model training. Furthermore, we propose a model-tool co-evolution mechanism that jointly refines the toolset and the model's tool-use policy, driving them toward a mutually adapted and FGIC-specialized state. Extensive experiments demonstrate the effectiveness of our framework.

Yu Xue, Haoxuan Qu, Zhuoling Li, Yihang Lou, Yan Bai, Hossein Rahmani, Jun Liu • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Fine-grained Image Classification | CUB-200 | Accuracy (All): 83 | 39 |
| Fine-grained Image Classification | Oxford Flowers 102 | Accuracy: 95.8 | 33 |
| Fine-grained Image Classification | Stanford Cars | Base Accuracy: 90 | 27 |
| Fine-grained visual classification | Oxford-IIIT Pet (test) | -- | 10 |
| Fine-grained Image Classification | Oxford Pets-37 | Accuracy: 97.3 | 7 |
| Fine-grained Image Classification | Stanford Cars 196 | Accuracy: 83.4 | 7 |
| Fine-grained Image Classification | FGVC Aircraft 100 | Accuracy: 76.6 | 7 |
| Fine-grained Image Classification | Oxford Flowers-102 (test) | Base Accuracy (B): 99.3 | 7 |
| Fine-grained Image Classification | FGVC-Aircraft (test) | Base Accuracy: 75.6 | 7 |
| Fine-grained Image Classification | Average (5 datasets) Macro-average (test) | Base Accuracy: 89.3 | 7 |
Showing 10 of 14 rows
