Discover SOTA papers with code
SOTA by domains
LLM
Computer Vision
Speech & Audio
Image Generation
SOTA by tasks
Image Classification
Semantic Segmentation
Object Detection
Question Answering
SOTA by datasets
ImageNet
ADE20K
COCO
Kinetics
Trending benchmarks
rFVD: 8.6
Video Reconstruction on UCF-101
Updated 1d ago
mIoU: 68.64
Semantic Segmentation on ADE20K
Updated 1d ago
Delta Threshold Accuracy (1.25): 73.5
Video Depth Estimation on Sintel
Updated 1d ago
Delta 1 Acc: 98.5
Monocular Depth Estimation on NYU v2
Updated 1d ago
ATE: 0.061
Camera Pose Estimation on Sintel
Updated 1d ago
Overall Score: 75
Video Understanding on Video-MME without subtitles
Updated 1d ago
M-Avg: 84.3
Long Video Understanding on MLVU
Updated 1d ago
Accuracy: 100.3
Video Understanding on MVBench (test)
Updated 1d ago
Accuracy: 84.5
Video Question Answering on NExT-QA Multi-choice
Updated 1d ago
Abs Rel: 0.049
Monocular Depth Estimation on NYU v2 (test)
Updated 1d ago
Accuracy: 96.21
Mathematical Reasoning on GSM8k (Accuracy)
Updated 1d ago
Accuracy: 85
Mathematical Reasoning on Countdown
Updated 1d ago
Overall Accuracy: 90.5
Point Tracking on TAP-Vid Kinetics
Updated 1d ago
Accuracy: 94.42
Object Hallucination Evaluation on POPE
Updated 1d ago
Pass@1: 100
Code Generation on HumanEval (test)
Updated 1d ago
Pass@1: 95.1
Code Generation on MBPP (test)
Updated 1d ago
Accuracy: 80.2
Mathematical Reasoning on MathVista mini (test)
Updated 1d ago
Accuracy: 80.2
Visual Question Answering on RealworldQA
Updated 1d ago
Accuracy: 81.9
Visual Question Answering on GQA
Updated 1d ago
Accuracy: 84.2
Multi-discipline Multimodal Understanding on MMMU
Updated 1d ago