
Long-context Factuality Evaluation on LongBench (Factuality Subset)

Fact Count: 32.86

DPO w/ LongReward

[Chart: Fact Count, Oct 28, 2024]

Evaluation Results

Date       Fact Count   Second score (unlabeled)
2024.10    32.86        92.85
2024.10    28.05        93.62
2024.10    21.76        91.94
2024.10    18.41        91.43