Fact-checking on HOVER 2-hop (test)
Top result: 75.13 Macro F1 (Trification), Nov 29, 2025
Evaluation Results
Method          Details                     Date      Macro F1
Trification     Model Category=Our, Me...   2025.11   75.13
PACAR           Model Category=III          2025.11   73.13
LoCaL           Model Category=IV           2025.11   72.71
Trification     Model Category=Our, Me...   2025.11   71.22
ProgramFC       Model Category=III, Nu...   2025.11   70.3
ProgramFC       Model Category=III, Nu...   2025.11   69.36
Flan-T5         Model Category=II           2025.11   69.02
DeBERTaV3-NLI   Model Category=I            2025.11   68.72
ChatGPT         Model Category=II           2025.11   66.94
Trification     Model Category=Our, Me...   2025.11   65.64
Codex           Model Category=II           2025.11   65.07
SearChain       Model Category=IV           2025.11   64.46
RoBERTa-NLI     Model Category=I            2025.11   63.62
MULTIVERS       Model Category=I            2025.11   60.17
LisT5           Model Category=I            2025.11   52.56
BERT-FC         Model Category=I            2025.11   50.68