Fact-checking on AVeriTeC (dev)
[Chart: Hu-METEOR (Q only) by date; highlighted point: GPT-4o, 0.46, Oct 15, 2024. Selectable metrics: Hu-METEOR (Q only), Hu-METEOR (Q+A), AVeriTeC Score.]
Updated 4d ago
Evaluation Results
Method | Description | Date | Hu-METEOR (Q only) | Hu-METEOR (Q+A) | AVeriTeC Score
GPT-4o | Pipeline configuration... | 2024.10 | 0.46 | 0.29 | 0.42
Llama 3.1 70B | Pipeline configuration... | 2024.10 | 0.46 | 0.27 | 0.36
GPT-4o | Pipeline configuration... | 2024.10 | 0.45 | 0.28 | 0.38
GPT-4o | Pipeline configuration... | 2024.10 | 0.45 | 0.28 | 0.36
Claude-3.5-Sonnet | Pipeline configuration... | 2024.10 | 0.43 | 0.28 | 0.35
AVeriTeC baseline | - | 2024.10 | 0.24 | 0.19 | 0.09