MATANet: A Multi-context Attention and Taxonomy-Aware Network for Fine-Grained Underwater Recognition of Marine Species
About
Fine-grained recognition of marine organisms is important for ecological research, biodiversity monitoring, habitat conservation, and evidence-based policy-making. However, many existing approaches rely primarily on object- or ROI-centered representations, which capture little of the surrounding environment. This limitation can reduce discriminative performance in challenging underwater scenes, where visually similar organisms often appear under diverse environmental conditions. To address this, we propose MATANet (Multi-context Attention and Taxonomy-Aware Network), a framework for fine-grained taxonomic recognition of marine organisms. MATANet is motivated by how expert taxonomists identify organisms, considering both organism-level morphology and contextual cues. The framework consists of two main components. First, the Multi-Context Environmental Attention Module (MCEAM) models cross-attention between the primary region of interest (ROI) and multi-scale surrounding environmental regions, thereby combining local morphological cues with habitat-level contextual information. Second, the Hierarchy-Aware Representation Learning Module (HRLM) uses the taxonomic hierarchy as auxiliary supervision to regularize representation learning and encourage semantically structured embeddings across taxonomic levels. By jointly modeling organism appearance, environmental context, and taxonomic structure, MATANet learns more discriminative representations for fine-grained taxonomic recognition. Experiments on FathomNet2025 and LifeCLEF2015-Fish demonstrate that MATANet consistently improves recognition performance over existing methods. Additional experiments on FAIR1M further examine the applicability of the proposed framework beyond underwater imagery. Notably, MATANet ranked first in the FathomNet 2025 Challenge at the CVPR 2025 FGVC12 workshop.
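The two ideas described above, ROI-to-context cross-attention and hierarchy-based auxiliary supervision, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cross_attend` and `hierarchy_loss`, the toy feature vectors, and the per-level weights are all hypothetical. The attention here is single-head with no learned projections, and the loss is assumed to be a weighted sum of per-level cross-entropy terms.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(roi_query, context_feats):
    """Scaled dot-product attention of one ROI query over context vectors.

    Assumption: keys and values are the raw context features (no learned
    projections). Returns the attended vector and the attention weights.
    """
    d = len(roi_query)
    scores = [
        sum(q * k for q, k in zip(roi_query, feat)) / math.sqrt(d)
        for feat in context_feats
    ]
    weights = softmax(scores)
    attended = [
        sum(w * feat[i] for w, feat in zip(weights, context_feats))
        for i in range(d)
    ]
    return attended, weights


def hierarchy_loss(level_probs, level_targets, level_weights):
    """Weighted sum of cross-entropy terms, one per taxonomic level.

    Assumption: each level (e.g. family, genus) has its own predicted
    distribution and its own target index.
    """
    return sum(
        w * -math.log(probs[t])
        for probs, t, w in zip(level_probs, level_targets, level_weights)
    )


# Hypothetical toy inputs: one ROI feature and three context-scale features.
roi = [1.0, 0.0]
contexts = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
fused, attn = cross_attend(roi, contexts)

# Hypothetical two-level prediction (coarse level, fine level).
loss = hierarchy_loss(
    level_probs=[[0.5, 0.5], [0.25, 0.75]],
    level_targets=[0, 1],
    level_weights=[1.0, 0.5],
)
```

In this sketch the context vector most similar to the ROI query receives the largest attention weight, and a lower weight on the finer level reduces the contribution of that level's errors to the total loss. How MATANet actually weights levels or fuses scales is not specified in the abstract.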
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Hierarchical classification | FathomNet Private 2025 (test) | Hierarchical Distance (HD) | 1.45 | 15 |
| Hierarchical classification | FathomNet Weighted Overall 2025 (Weighted Public Private) | Weighted Hierarchical Distance (WgtAvg) | 1.54 | 15 |
| Hierarchical classification | FathomNet Public 2025 (test) | Hierarchical Distance (HD) | 1.62 | 15 |
| Classification | FAIR1M domain generalization evaluation v2 | Accuracy (ACC) | 74 | 10 |
| Classification | FishCLEF 2015 | Accuracy | 78.9 | 10 |
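To make the hierarchical distance metric more concrete, the sketch below computes a tree distance between a predicted and a true taxon. The exact definition used by the benchmark is not given here, so this assumes the distance is the number of edges on the path between the two nodes in the taxonomy tree, averaged over samples. The `PARENT` mapping is a hypothetical, abbreviated toy taxonomy, and the function names are illustrative.

```python
# Hypothetical toy taxonomy, stored as child -> parent (ranks abbreviated).
PARENT = {
    "Sebastes": "Sebastidae",
    "Sebastolobus": "Sebastidae",
    "Sebastidae": "Actinopterygii",
    "Octopus": "Octopodidae",
    "Octopodidae": "Cephalopoda",
    "Actinopterygii": "Animalia",
    "Cephalopoda": "Animalia",
}


def ancestors(node):
    """Return the chain from a node up to the root, node included."""
    chain = [node]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def hierarchical_distance(pred, true):
    """Number of edges between two taxa via their lowest common ancestor."""
    true_depth = {name: i for i, name in enumerate(ancestors(true))}
    for steps_up, name in enumerate(ancestors(pred)):
        if name in true_depth:
            return steps_up + true_depth[name]
    raise ValueError("taxa share no common ancestor")


def mean_hierarchical_distance(pairs):
    """Average tree distance over (predicted, true) pairs."""
    return sum(hierarchical_distance(p, t) for p, t in pairs) / len(pairs)


pairs = [
    ("Sebastes", "Sebastes"),      # exact match
    ("Sebastes", "Sebastolobus"),  # sibling genus
    ("Sebastes", "Sebastidae"),    # correct family, too specific
    ("Octopus", "Sebastes"),       # distant branch
]
score = mean_hierarchical_distance(pairs)
```

Under this assumed definition lower is better: an exact match costs 0, a confusion between sibling genera costs 2, and a prediction in a distant branch is penalized by the full path through the shared ancestor.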