Multi-segment grounding on Perception Test (val)
[Chart: F1 Score by date. Highlighted point: MUSEG-7B, 31.7 (May 27, 2025).]
Updated 1mo ago
Evaluation Results
Method                         Category           Date      F1 Score
MUSEG-7B                       Open-source ~...   2025.05   31.7
Qwen2.5-VL-7B + vanilla GRPO   Open-source ~...   2025.05   30
MUSEG-3B                       Open-source ~...   2025.05   29.1
VideoChat-R1                   Open-source ~...   2025.05   27.1
TimeZero                       Open-source ~...   2025.05   26.8
Qwen2.5-VL-7B                  Open-source ~...   2025.05   25.3
TEMPURA                        Open-source ~...   2025.05   20.7
Qwen2.5-VL-7B + vanilla SFT    Open-source ~...   2025.05   20.3
Qwen2.5-VL-3B                  Open-source ~...   2025.05   19.4
TRACE-7B                       Open-source ~...   2025.05   14
E.T. Chat                      Open-source ~...   2025.05   9.2
Video-R1                       Open-source ~...   2025.05   5.7