Justification Quality Evaluation on AVeriTeC Retrieved (H) 50 correctly verified claims
[Chart: MOS by method over time. Highlighted entry: DebateCV, MOS 3.67, Jul 25, 2025. Metrics shown: MOS, Win Rate, Tie Rate, Loss Rate.]
Evaluation Results
| Method | Criterion | Date | MOS | Win Rate (%) | Tie Rate (%) | Loss Rate (%) |
|---|---|---|---|---|---|---|
| DebateCV | Evidence Use... | 2025.07 | 3.67 | 57.6 | 27.5 | 14.9 |
| DebateCV | Reasoning Pa... | 2025.07 | 3.59 | 55.6 | 28.3 | 16.1 |
| DebateCV | Sources of U... | 2025.07 | 3.07 | 50.5 | 32.8 | 16.7 |
| HerO | Evidence Use... | 2025.07 | 2.6 | - | - | - |
| HerO | Reasoning Pa... | 2025.07 | 2.49 | - | - | - |
| HerO | Sources of U... | 2025.07 | 2.22 | - | - | - |
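The results above can be sanity-checked and summarized with a few lines of Python. The sketch below is illustrative only: the row values are transcribed from the table (criterion names are kept truncated as they appear on the page), while the names `ROWS`, `mean_mos_by_method`, and `rates_sum_to_100` are hypothetical helpers, and the per-method mean MOS is a derived figure that the benchmark page itself does not report. It also assumes the win, tie, and loss rates are percentages.

```python
from statistics import mean

# Rows transcribed from the results table:
# (method, criterion as truncated on the page, MOS, win rate, tie rate, loss rate).
# Rates are None where the table shows "-".
ROWS = [
    ("DebateCV", "Evidence Use...", 3.67, 57.6, 27.5, 14.9),
    ("DebateCV", "Reasoning Pa...", 3.59, 55.6, 28.3, 16.1),
    ("DebateCV", "Sources of U...", 3.07, 50.5, 32.8, 16.7),
    ("HerO", "Evidence Use...", 2.6, None, None, None),
    ("HerO", "Reasoning Pa...", 2.49, None, None, None),
    ("HerO", "Sources of U...", 2.22, None, None, None),
]


def mean_mos_by_method(rows):
    """Average MOS across criteria for each method (a derived, unofficial summary)."""
    by_method = {}
    for method, _criterion, mos, *_rates in rows:
        by_method.setdefault(method, []).append(mos)
    return {method: round(mean(scores), 2) for method, scores in by_method.items()}


def rates_sum_to_100(rows, tol=0.05):
    """Check that win + tie + loss is 100 (within tol) for every row that reports rates."""
    for _method, _criterion, _mos, win, tie, loss in rows:
        if win is None:
            continue  # no pairwise comparison reported for this row
        if abs(win + tie + loss - 100.0) > tol:
            return False
    return True


if __name__ == "__main__":
    print(mean_mos_by_method(ROWS))
    print(rates_sum_to_100(ROWS))
```

Averaging MOS across the three criteria weights each criterion equally, which is a simplification; the per-criterion scores in the table remain the reported results.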