Object Hallucination Detection on POPE (test)
[Chart: Final Performance Score by date. Highlighted point: MAI, 89.4, Apr 6, 2026. Selectable metrics: Final Performance Score, MV 2B(S)/8B(S) Score, MV 2B(R)/8B(S) Score, Runtime (ms).]
Updated 11d ago
Evaluation Results
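Method         | Configuration             | Date    | Final Performance Score | MV 2B(S)/8B(S) Score | MV 2B(R)/8B(S) Score | Runtime (ms)
---------------+---------------------------+---------+-------------------------+----------------------+----------------------+-------------
MAI            | Backbone=Qwen3-VL-2B/8B   | 2026.04 | 89.4                    | 8,507                | 46                   | 447
Full RT        | Backbone=Qwen3-VL-8B+2B-R | 2026.04 | 89.2                    | -                    | -                    | -
2B(S) Baseline | Backbone=Qwen3-VL-2B,...  | 2026.04 | 88.6                    | -                    | -                    | -
2B(R) Baseline | Backbone=Qwen3-VL-2B,...  | 2026.04 | 88.4                    | -                    | -                    | -
8B(S) Baseline | Backbone=Qwen3-VL-8B,...  | 2026.04 | 87.1                    | -                    | -                    | -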