General-domain Persuasion Robustness on FARM
[Chart: NQ1 Score by method; R-FT at 63 as of Apr 23, 2026. Selectable metrics: NQ1 Score, NQ2 Score, TruthfulQA Score, BoolQ Score.]
Updated 8d ago
Evaluation Results
Method    Date      NQ1 Score  NQ2 Score  TruthfulQA Score  BoolQ Score
R-FT      2026.04   63         72         86                69
RBED      2026.04   32         50         72                55
Vanilla   2026.04   25         40         50                37