| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Object Detection | LLVIP | mAP50 98.2 | 58 |
| Object Detection | LLVIP (test) | mAP50 98.3 | 38 |
| Infrared-Visible Image Fusion | LLVIP (test) | EN 7.42 | 23 |
| Pedestrian Detection | LLVIP (test) | mAP@50 97.2 | 20 |
| Classification | LLVIP | RGB Accuracy 74.1 | 18 |
| Image-to-Image Translation | LLVIP | PSNR (dB) 12.66 | 14 |
| Infrared Image Classification | LLVIP | Top-1 Accuracy 87.2 | 13 |
| Visible-Infrared Image Fusion | LLVIP | EI 14.74 | 10 |
| Infrared-visible Image Fusion | LLVIP | EN 7.36 | 8 |
| Multi-Modal Image Fusion | LLVIP (test) | SSIM122 | 8 |
| Text-to-Thermal Retrieval (via Vision pivot) | LLVIP to FLIR (test) | mAP 37.6 | 6 |
| Thermal-to-Text Retrieval (via Vision pivot) | LLVIP to FLIR (test) | mAP 40.2 | 6 |
| Object Detection | LLVIP (val) | mAP 97.8 | 5 |
| Image Fusion | LLVIP | EI 14.74 | 4 |
| Multi-modal Joint Retrieval | LLVIP | Top-1 Accuracy 79.3 | 2 |
| Thermal Image Classification | LLVIP (test) | Top-1 Accuracy - | 0 |