Fact-checking on HOVER 3-hop (test)
[Chart: Macro F1. Top result: 66.42 (Trification), Nov 29, 2025]
Evaluation Results
| Method | Details | Date | Macro F1 |
|---|---|---|---|
| Trification | Model Category=Our, Me... | 2025.11 | 66.42 |
| LoCaL | Model Category=IV | 2025.11 | 64.11 |
| PACAR | Model Category=III | 2025.11 | 64.07 |
| ProgramFC | Model Category=III, Nu... | 2025.11 | 63.43 |
| Trification | Model Category=Our, Me... | 2025.11 | 61.56 |
| DeBERTaV3-NLI | Model Category=I | 2025.11 | 60.76 |
| ProgramFC | Model Category=III, Nu... | 2025.11 | 60.63 |
| ChatGPT | Model Category=II | 2025.11 | 60.56 |
| SearChain | Model Category=IV | 2025.11 | 60.3 |
| Flan-T5 | Model Category=II | 2025.11 | 60.23 |
| Codex | Model Category=II | 2025.11 | 56.63 |
| Trification | Model Category=Our, Me... | 2025.11 | 55.1 |
| RoBERTa-NLI | Model Category=I | 2025.11 | 53.99 |
| MULTIVERS | Model Category=I | 2025.11 | 52.55 |
| LisT5 | Model Category=I | 2025.11 | 51.86 |
| BERT-FC | Model Category=I | 2025.11 | 49.86 |