
Learning Concise and Descriptive Attributes for Visual Recognition

About

Recent advances in foundation models present new opportunities for interpretable visual recognition -- one can first query Large Language Models (LLMs) to obtain a set of attributes that describe each class, then apply vision-language models to classify images via these attributes. Pioneering work shows that querying thousands of attributes can achieve performance competitive with image features. However, our further investigation on 8 datasets reveals that LLM-generated attributes in a large quantity perform almost the same as random words. This surprising finding suggests that significant noise may be present in these attributes. We hypothesize that there exist subsets of attributes that can maintain the classification performance with much smaller sizes, and propose a novel learning-to-search method to discover those concise sets of attributes. As a result, on the CUB dataset, our method achieves performance close to that of massive LLM-generated attributes (e.g., 10k attributes for CUB), yet using only 32 attributes in total to distinguish 200 bird species. Furthermore, our new paradigm demonstrates several additional benefits: higher interpretability and interactivity for humans, and the ability to summarize knowledge for a recognition task.
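The attribute-based classification pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation and does not include the learning-to-search step. It only shows how an image can be classified from its similarity to a small set of attributes. The embeddings here are hand-written toy vectors standing in for the outputs of a vision-language model, and the names `attribute_scores`, `classify`, and `class_weights` are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def attribute_scores(image_emb, attr_emb):
    """Cosine similarity of every image to every attribute.

    image_emb: (n_images, dim), attr_emb: (n_attrs, dim)
    returns:   (n_images, n_attrs)
    """
    return l2_normalize(image_emb) @ l2_normalize(attr_emb).T


def classify(image_emb, attr_emb, class_weights):
    """Predict a class using only the attribute scores as features.

    class_weights: (n_attrs, n_classes), assumed here to be given;
    in practice such weights would be learned.
    """
    logits = attribute_scores(image_emb, attr_emb) @ class_weights
    return logits.argmax(axis=1)


# Toy setup (assumption): 4 attributes embedded as orthogonal unit vectors.
attr_emb = np.eye(4)

# Class 0 is described by attributes 0 and 1, class 1 by attributes 2 and 3.
class_weights = np.array(
    [[1.0, 0.0],
     [1.0, 0.0],
     [0.0, 1.0],
     [0.0, 1.0]]
)

# Two toy image embeddings in the same 4-d space.
image_emb = np.array(
    [[0.9, 0.8, 0.1, 0.0],
     [0.0, 0.2, 0.7, 0.9]]
)

preds = classify(image_emb, attr_emb, class_weights)
print(preds)
```

Because each prediction is a weighted sum of per-attribute similarities, the scores returned by `attribute_scores` can be inspected directly to see which attributes drove a decision, which is the kind of interpretability the abstract refers to.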

An Yan, Yu Wang, Yiwu Zhong, Chengyu Dong, Zexue He, Yujie Lu, William Wang, Jingbo Shang, Julian McAuley • 2023

Related benchmarks

Task                          Dataset             Result               Rank
Image Classification          Food-101            Accuracy 92.2        494
Image Classification          Flowers102          Accuracy 94.6        478
Image Classification          Food101             Accuracy 81.1        309
Image Classification          RESISC45            --                   263
Image Classification          CUB-200 2011        Accuracy 79.3        257
Image Classification          Oxford Flowers 102  --                   172
Image Classification          ImageNet            Acc 84.7             45
Medical Image Classification  HAM10000            Accuracy 66.8        39
Image Classification          FGVC Aircraft       Accuracy 54.7        32
Action Recognition            UCF-101             Accuracy (ACC) 89.3  10
Showing 10 of 18 rows
