Vision Model + Dual Text Decoders
| Method | | | | | | | | | Links |
|---|---|---|---|---|---|---|---|---|---|
| | 90.6 | 87.7 | 84.9 | 82.8 | 58.9 | 90.5 | 6.696 | 83.2 | |
| 2024.03 | 90.3 | 87 | 83.6 | 80.7 | 57.8 | 90.2 | 5.675 | 83.5 | |
| 2024.03 | 89.7 | 86.3 | 83 | 80.6 | 57.1 | 88.8 | 6.355 | 81.7 | |
| 2024.03 | 89.3 | 85.4 | 81.6 | 78.8 | 57.3 | 88.4 | 5.893 | 81.5 | |
| 2024.03 | 88.2 | 84.3 | 80.3 | 77.2 | 54.6 | 87.6 | 5.234 | 79.7 | |
| 2024.03 | 88.1 | 83.9 | 79.8 | 76.7 | 54.1 | 87.1 | 5.314 | 80.2 | |
| 2024.03 | 87.5 | 83.5 | 79.6 | 76.6 | 54.1 | 86.6 | 5.558 | 79.9 | |
| 2024.03 | 62.5 | 58.8 | 55.1 | 51.9 | 49.3 | 73.8 | 1.868 | 82.4 | |