Entropy-Aware Structural Alignment for Zero-Shot Handwritten Chinese Character Recognition
About
Zero-shot Handwritten Chinese Character Recognition (HCCR) aims to recognize unseen characters by leveraging radical-based semantic compositions. However, existing approaches often treat characters as flat radical sequences, neglecting both the hierarchical topology and the uneven information density of different components. To address these limitations, we propose an Entropy-Aware Structural Alignment Network that bridges the visual-semantic gap through information-theoretic modeling. First, we introduce an Information Entropy Prior that dynamically modulates positional embeddings via multiplicative interaction, acting as a saliency detector that prioritizes discriminative roots over ubiquitous components. Second, we construct a Dual-View Radical Tree to extract multi-granularity structural features, which are integrated via an adaptive sigmoid-based gating network to encode both global layout and local spatial roles. Finally, we devise a Top-K Semantic Feature Fusion mechanism that augments decoding with the centroid of semantic neighbors, rectifying visual ambiguities through feature-level consensus. Extensive experiments demonstrate that our method establishes new state-of-the-art performance, achieving an accuracy of 55.04% on the ICDAR 2013 dataset (m=1500) and significantly outperforming existing CLIP-based baselines in the challenging zero-shot setting. The framework is also data-efficient, reaching 92.41% accuracy with only one support sample per class.
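The three mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no formulas, so every function name, shape, and parameter here is hypothetical. In particular, the sketch assumes the entropy prior is the self-information of each radical computed from corpus frequency counts, that the gate is a single linear layer followed by a sigmoid, and that semantic neighbors are chosen by cosine similarity.

```python
import numpy as np

def radical_information(radical_counts):
    """Self-information -log2 p(r) per radical (assumed form of the entropy prior)."""
    counts = np.asarray(radical_counts, dtype=float)
    probs = counts / counts.sum()
    return -np.log2(probs)

def modulate_positions(pos_emb, info):
    """Multiply each positional embedding row by its radical's normalized information.

    Rare (high-information) radicals keep a weight near 1; ubiquitous ones are damped.
    """
    weights = info / info.max()
    return pos_emb * weights[:, None]

def sigmoid_gate(global_feat, local_feat, w, b):
    """Blend two feature views with an element-wise sigmoid gate.

    w has shape (2 * d, d) and b has shape (d,); both are hypothetical learned parameters.
    """
    z = np.concatenate([global_feat, local_feat]) @ w + b
    gate = 1.0 / (1.0 + np.exp(-z))
    return gate * global_feat + (1.0 - gate) * local_feat

def topk_centroid(query, class_embs, k):
    """Return the mean of the k class embeddings most cosine-similar to the query."""
    q = query / np.linalg.norm(query)
    c = class_embs / np.linalg.norm(class_embs, axis=1, keepdims=True)
    sims = c @ q
    top = np.argsort(-sims)[:k]  # indices of the k highest similarities
    return class_embs[top].mean(axis=0), top

# Toy usage: four radicals with made-up corpus counts.
info = radical_information([8, 4, 2, 2])
pos = modulate_positions(np.ones((4, 2)), info)
centroid, neighbors = topk_centroid(
    np.array([1.0, 0.0]),
    np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]),
    k=2,
)
```

In this toy setup the most frequent radical receives the smallest positional weight, and the centroid averages the two class embeddings closest to the query. How the centroid is then combined with the visual feature during decoding is not specified in the abstract, so the sketch stops at computing it.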
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Zero-shot Chinese Character Recognition | ICDAR Unseen characters 2013 (test) | Top-1 Acc (m=500) | 24.54 | 12 |
| Character Recognition | CASIA-HWDB Full-Set standard (test) | Accuracy | 97.3 | 10 |
| Zero-shot Chinese Radical Recognition | ICDAR Unseen radicals 2013 (test) | Top-1 Acc (k=50) | 24.61 | 8 |
| Character Recognition | Character Recognition Efficiency Benchmark RTX 4090 (test) | Speed (ms) | 0.74 | 8 |