
Long-form generation factuality and uncertainty estimation on LongFact (test)

91.5 Factuality Score

LOGU-DPO

[Chart: Factuality Score, y-axis ticks 84.74 to 90.005; Oct 18, 2024]

Evaluation Results

Date      Factuality Score
2024.10   91.5               47.5   2.36
2024.10   91.3               54.6   2.09
2024.10   89.3               30.4   2.67
2024.10   88.6               43.5   3.21
2024.10   88.5               29.9   1.85
2024.10   88.3               40.6   4.01
2024.10   87.5               44.8   4.53
2024.10   86.7               42.2   4.78
2024.10   86.2               -      4.55
2024.10   86.2               42.4   4.61
2024.10   86.1               35.1   5.08
2024.10   85.6               32.4   5.58
2024.10   85.5               -      7.45
2024.10   85                 27.2   6.38