Behavior Generation on BiPO
[Chart: Hallucination Score by method, Mar 6, 2026 (Base: 1.59). Selectable metrics: Hallucination Score, Power Dynamics Score, Wealth Distribution Score.]
Updated 2mo ago
Evaluation Results
Method        Configuration              Date     Hallucination Score  Power Dynamics Score  Wealth Distribution Score
Base          Backbone=Llama-2-7b-ch...  2026.03  1.59                 2                     2.48
COLD-Kernel   Backbone=Llama-2-7b-ch...  2026.03  1.62                 2.02                  2.48
ReFT(vector)  Backbone=Llama-2-7b-ch...  2026.03  1.63                 2                     2.42
DiffMean      Backbone=Llama-2-7b-ch...  2026.03  1.71                 2.22                  2.58
COLD-FD       Backbone=Llama-2-7b-ch...  2026.03  3.87                 2.15                  2.6