Fact-checking on FEVEROUS-S
[Chart: Macro F1 over time. Top result: RRC, 72.55 Macro F1 (Jan 23, 2026).]
Evaluation Results
| Method | Method Category | Date | Macro F1 |
|---|---|---|---|
| RRC | LLM-ba... | 2026.01 | 72.55 |
| BiDev | LLM-ba... | 2026.01 | 68.61 |
| Bootstrapping | LLM-ba... | 2026.01 | 66.12 |
| ProgramFC | LLM-ba... | 2026.01 | 65.59 |
| FOLK | LLM-ba... | 2026.01 | 64.07 |
| FLAN-T5 | LLM-ba... | 2026.01 | 63.73 |
| Codex | LLM-ba... | 2026.01 | 62.58 |
| DeBERTaV3-NLI | Finetu... | 2026.01 | 58.81 |
| RoBERTa-NLI | Finetu... | 2026.01 | 57.8 |
| MULTIVERS | Finetu... | 2026.01 | 56.61 |
| LisT5 | Pretra... | 2026.01 | 54.15 |
| BERT-FC | Pretra... | 2026.01 | 51.67 |
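
For readers unfamiliar with the metric, here is a minimal sketch of how a Macro F1 score is typically computed: per-class F1 is calculated independently and then averaged with equal weight per class. The function name `macro_f1` and the toy labels are hypothetical and chosen for illustration only; this is not the benchmark's official scoring script, and the leaderboard appears to report the value multiplied by 100.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    if not labels:
        return 0.0  # no examples: define the score as 0 to avoid dividing by zero

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1      # correct prediction for class g
        else:
            fp[p] += 1      # predicted p, but it was wrong
            fn[g] += 1      # missed an instance of class g

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data, not taken from the benchmark.
gold = ["SUPPORTS", "SUPPORTS", "REFUTES", "REFUTES"]
pred = ["SUPPORTS", "REFUTES", "REFUTES", "REFUTES"]
score = macro_f1(gold, pred)
print(f"Macro F1: {100 * score:.2f}")
```

Because each class contributes equally to the average, a method that does well on only one label is penalized more than it would be under accuracy or micro-averaged F1.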