
SequencePAR: Understanding Pedestrian Attributes via A Sequence Generation Paradigm

About

Current pedestrian attribute recognition (PAR) algorithms use multi-label or multi-task learning frameworks with dedicated classification heads. These models often struggle with imbalanced data and noisy samples. Inspired by the success of generative models, we propose Sequence Pedestrian Attribute Recognition (SequencePAR), a novel sequence generation paradigm for PAR. SequencePAR extracts pedestrian features using a language-image pre-trained model and embeds the attribute set into query tokens guided by text prompts. A Transformer decoder then generates human attributes by integrating the visual features with the attribute query tokens. The masked multi-head attention layer in the decoder prevents the model from seeing subsequent attributes while it predicts the current one during training. Extensive experiments on multiple PAR datasets validate the effectiveness of SequencePAR. Specifically, we achieve 84.92%, 90.44%, 90.73%, and 90.46% in accuracy, precision, recall, and F1-score on the PETA dataset. The source code and pre-trained models are available at https://github.com/Event-AHU/OpenPAR.
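The masking step described in the abstract can be illustrated with a minimal sketch. This is not the SequencePAR implementation: it only shows how a causal (lower-triangular) mask keeps each position in an attribute sequence from attending to later positions. The function names, the attribute tokens, and the uniform raw scores are assumptions made for illustration.

```python
import math


def causal_mask(n):
    """Return an n x n boolean mask; mask[i][j] is True when position i may attend to position j."""
    return [[j <= i for j in range(n)] for i in range(n)]


def masked_softmax(scores, mask):
    """Row-wise softmax in which disallowed positions receive exactly zero weight."""
    out = []
    for row, allowed in zip(scores, mask):
        # Subtract the largest allowed score for numerical stability.
        peak = max(s for s, ok in zip(row, allowed) if ok)
        exps = [math.exp(s - peak) if ok else 0.0 for s, ok in zip(row, allowed)]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


# Hypothetical attribute sequence for one pedestrian image (illustrative names only).
tokens = ["<start>", "female", "long hair", "backpack"]

# Uniform raw attention scores, so the effect of the mask is easy to read off.
scores = [[0.0] * len(tokens) for _ in tokens]
weights = masked_softmax(scores, causal_mask(len(tokens)))

for token, row in zip(tokens, weights):
    print(f"{token:>10}: {[round(w, 2) for w in row]}")
```

Each row spreads its attention only over the current and earlier tokens, so the weight on any later attribute is zero. During training this is what allows the full ground-truth sequence to be fed to the decoder at once without leaking the next attribute.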

Jiandong Jin, Xiao Wang, Yin Lin, Chenglong Li, Lili Huang, Aihua Zheng, Jin Tang • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| Pedestrian Attribute Recognition | EventPAR (test) | mA | 86.27 | 40 |
| Pedestrian Attribute Recognition | MSP60K | mA | 71.88 | 19 |
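The benchmark results above are reported as mA, which this page does not define. In the PAR literature, mA usually denotes label-based mean accuracy: for each attribute, the average of the true-positive rate and the true-negative rate, then averaged over all attributes. The sketch below assumes that conventional definition. The function name, the toy labels, and the choice to score an undefined rate as 0.0 are all assumptions for illustration, not taken from the benchmarks listed.

```python
def mean_accuracy(y_true, y_pred):
    """Label-based mean accuracy (mA) over binary attribute matrices.

    y_true and y_pred are lists of rows (one per image), each row a list of 0/1
    values (one per attribute). Assumes the conventional PAR definition of mA.
    """
    n_attrs = len(y_true[0])
    per_attr = []
    for a in range(n_attrs):
        tp = fn = tn = fp = 0
        for true_row, pred_row in zip(y_true, y_pred):
            t, p = true_row[a], pred_row[a]
            if t == 1 and p == 1:
                tp += 1
            elif t == 1 and p == 0:
                fn += 1
            elif t == 0 and p == 0:
                tn += 1
            else:
                fp += 1
        # Assumption: a rate with an empty denominator is scored as 0.0.
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        tnr = tn / (tn + fp) if (tn + fp) else 0.0
        per_attr.append((tpr + tnr) / 2)
    return sum(per_attr) / n_attrs


# Toy example: 4 images, 2 attributes (made-up labels).
y_true = [[1, 0], [1, 1], [0, 0], [0, 1]]
y_pred = [[1, 0], [0, 1], [0, 0], [0, 0]]
score = mean_accuracy(y_true, y_pred)
print(f"mA = {score:.2%}")
```

Because positives and negatives are averaged separately for each attribute, mA is less dominated by the majority class than plain accuracy, which is relevant to the data imbalance that PAR datasets exhibit.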
