Veracity Assessment on FactCheck-Bench

91.3 Macro-F1

GPT-4.1

[Chart: score over time, Jan 10, 2026 to Jan 29, 2026]
Updated 4d ago

Evaluation Results

Date      Macro-F1
2026.01   91.3       -    -
2026.01   90.4       -    -
2026.01   89         -    -
2026.01   88         -    -
2026.01   87.9       -    -
2026.01   87.9       -    -
2026.01   86.9       -    -
2026.01   86.9       -    -
2026.01   86.8       -    -
2026.01   86         -    -
2026.01   83.7       -    -
2026.01   83.7       -    -
2026.01   78.4       -    -
2026.01   77         89   65
2026.01   77         88   66
2026.01   76         85   66
2026.01   75         89   61
2026.01   74         83   65
2026.01   74         84   65
2026.01   74         87   62
2026.01   73         82   64
2026.01   72         84   60
2026.01   69.8       -    -
2026.01   67         81   53
2026.01   62         71   52
2026.01   54         62   45
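
The scores above are reported as Macro-F1, which is the unweighted mean of the per-class F1 scores, so a rare class counts as much as a common one. The following is a minimal sketch of that calculation in Python. The function name `macro_f1` and the toy labels are hypothetical, chosen for illustration only. This is not the benchmark's evaluation script, and the actual label set of FactCheck-Bench may differ.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1   # correct prediction for class t
        else:
            fp[p] += 1   # predicted p, but the gold label was something else
            fn[t] += 1   # gold label t was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, not taken from FactCheck-Bench.
gold = ["true", "true", "false", "false", "false"]
pred = ["true", "false", "false", "false", "true"]
print(round(100 * macro_f1(gold, pred), 1))  # prints 58.3
```

In this toy case the per-class F1 scores are 0.5 and about 0.667, so the macro average is about 58.3 on the 0 to 100 scale used by the leaderboard.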