Hallucination on CAA
[Chart: Accuracy (pair) by date. Highlighted point: COLD-FD, 88, Mar 6, 2026. Metrics shown: Accuracy (pair), Accuracy (pos).]
Updated 2mo ago
Evaluation Results
| Method       | Backbone Model | Date    | Accuracy (pair) | Accuracy (pos) |
|--------------|----------------|---------|-----------------|----------------|
| COLD-FD      | Mistral...     | 2026.03 | 88              | 78             |
| DiffMean     | Mistral...     | 2026.03 | 80              | -              |
| ReFT(vector) | Mistral...     | 2026.03 | 80              | 80             |
| COLD-FD      | Gemma-2-9B     | 2026.03 | 70              | 74             |
| Base         | Gemma-2-9B     | 2026.03 | 64              | 64             |
| DiffMean     | Gemma-2-9B     | 2026.03 | 64              | -              |
| ReFT(vector) | Gemma-2-9B     | 2026.03 | 64              | 64             |
| Base         | Mistral...     | 2026.03 | 62              | 62             |