Video Action Recognition on CharadesEgo
[Chart: Top-1 Accuracy over time. Best result: DINOv3, 14 (Feb 9, 2026).]
Evaluation Results
| Method | Details | Date | Top-1 Accuracy |
|---|---|---|---|
| DINOv3 | Backbone=ViT-L/14, Res... | 2026.02 | 14 |
| DINOv3 | Backbone=ViT-L/14, Res... | 2026.02 | 13.2 |
| OV-Encoder (Codec) | Backbone=ViT-L/14, Res... | 2026.02 | 12.9 |
| OV-Encoder (Frame) | Backbone=ViT-L/14, Res... | 2026.02 | 12.6 |
| AIMv2 | Backbone=ViT-L/14, Res... | 2026.02 | 12.4 |
| OV-Encoder (Codec) | Backbone=ViT-L/14, Res... | 2026.02 | 12.3 |
| OV-Encoder (Frame) | Backbone=ViT-L/14, Res... | 2026.02 | 12.1 |
| AIMv2 | Backbone=ViT-L/14, Res... | 2026.02 | 12 |
| SigLIP2 | Backbone=ViT-L/16, Res... | 2026.02 | 11.9 |
| SigLIP | Backbone=ViT-L/16, Res... | 2026.02 | 11.7 |
| SigLIP2 | Backbone=ViT-L/16, Res... | 2026.02 | 11.6 |
| MetaCLIP2 | Backbone=ViT-L/14, Res... | 2026.02 | 11.2 |
| MetaCLIP2 | Backbone=ViT-L/14, Res... | 2026.02 | 11 |
| SigLIP | Backbone=ViT-L/16, Res... | 2026.02 | 10.9 |
| CLIP | Backbone=ViT-L/14, Res... | 2026.02 | 10.8 |
| MetaCLIP | Backbone=ViT-L/14, Res... | 2026.02 | 10.4 |