Alignment Evaluation on TruthfulQA
[Chart: A Wins by date. Highlighted point: Relaxed FPO (R̃_FPO), 188 A Wins, May 5, 2026.]
Updated 27d ago
Evaluation Results
| Method | Comparison Baseline | Date | A Wins | B Wins | Ties | Total Comparisons | Win Rate (A), % | p-value |
|---|---|---|---|---|---|---|---|---|
| Relaxed FPO (R̃_FPO) | St... | 2026.05 | 188 | 144 | 485 | 817 | 56.6 | 0.014 |
| Relaxed FPO (R̃_FPO) | Pr... | 2026.05 | 167 | 138 | 512 | 817 | 54.8 | 0.076 |
| Practical FPO (R̄_FPO) | St... | 2026.05 | 140 | 135 | 542 | 817 | 50.9 | 0.41 |
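The Win Rate (A) values are consistent with A Wins divided by the decisive comparisons (A Wins + B Wins), with ties excluded. The sketch below checks that reading against the reported counts. The function name `win_rate_a` and the row labels are illustrative, not taken from the page, and the tie-excluding formula is an assumption inferred from the numbers. The test behind the p-value column is not stated on the page, so it is not reproduced here.

```python
# Counts copied from the Evaluation Results table: (A wins, B wins, ties, total).
# Row labels are illustrative; baseline names are truncated on the page.
rows = {
    "Relaxed FPO vs. St...": (188, 144, 485, 817),
    "Relaxed FPO vs. Pr...": (167, 138, 512, 817),
    "Practical FPO vs. St...": (140, 135, 542, 817),
}


def win_rate_a(a_wins, b_wins):
    """Percent of decisive (non-tied) comparisons won by A (assumed formula)."""
    return round(100 * a_wins / (a_wins + b_wins), 1)


win_rates = {}
for name, (a, b, ties, total) in rows.items():
    # Wins and ties should add up to the reported total comparisons.
    assert a + b + ties == total
    win_rates[name] = win_rate_a(a, b)
    print(f"{name}: {win_rates[name]}%")
```

Under this reading, ties (roughly 59% to 66% of the 817 comparisons in each row) do not affect the win rate, so each percentage describes only the 275 to 332 comparisons that had a winner.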