Hallucination Evaluation on ObjHal
[Chart: CRs Accuracy; highlighted result: Muffin-13B, 47.3, Jul 29, 2025. Metric options: CRs Accuracy, Response Accuracy, Mention Accuracy, CRi Accuracy.]
Updated 12d ago
Evaluation Results
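| Method         | Details             | Date    | CRs Accuracy | Response Accuracy | Mention Accuracy | CRi Accuracy |
|----------------|---------------------|---------|--------------|-------------------|------------------|--------------|
| Muffin-13B     | Backbone=Muffin-13B | 2025.07 | 47.3         | -                 | -                | 15.2         |
| RLHF           | Backbone=Muffin-13B | 2025.07 | 45.5         | -                 | -                | 12.7         |
| DPO            | Backbone=Muffin-13B | 2025.07 | 43.8         | -                 | -                | 13.9         |
| CHiP-DPO       | Backbone=Muffin-13B | 2025.07 | 35.2         | -                 | -                | 11.5         |
| TARS (Replace) | Backbone=Muffin-13B | 2025.07 | 29.3         | -                 | -                | 8.8          |
| TARS (Mask)    | Backbone=Muffin-13B | 2025.07 | 28.7         | -                 | -                | 8.2          |
| LLaVA          |                     | 2025.04 | -            | 63                | 29.5             | -            |
| RLHF-V         |                     | 2025.04 | -            | 12.2              | 7.5              | -            |
| POPEN          |                     | 2025.04 | -            | 9.2               | 4.9              | -            |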