Fact-checking on ExClaim (subsample)
[Chart: Explanation Correctness by method. Top score: 4.68 (CO-FACTCHECKER), Apr 15, 2026. Selectable metrics: Explanation Correctness, Explanation Comprehensibility, Trace Correctness, Trace Comprehensibility.]
Updated 2d ago
Evaluation Results
| Method | Type | Date | Explanation Correctness | Explanation Comprehensibility | Trace Correctness | Trace Comprehensibility |
|---|---|---|---|---|---|---|
| CO-FACTCHECKER | Simulated Human-A... | 2026.04 | 4.68 | 4.33 | 4.12 | 3.52 |
| Multi-Turn Dialogue | Simulated Human-A... | 2026.04 | 4.56 | 4.39 | 3.42 | 2.48 |
| Verifier | Autonomous | 2026.04 | 4.52 | 4.23 | 3.83 | 3.47 |
| SAFE | Autonomous | 2026.04 | 4.45 | 4.11 | 3.99 | 3.52 |
| Deep Research | Autonomous | 2026.04 | 4.45 | 4.19 | - | - |
| FIRE | Autonomous | 2026.04 | 4.24 | 4.15 | 3.92 | 3.44 |
| FactCheckGPT | Autonomous | 2026.04 | 4.21 | 4.33 | 4.03 | 3.4 |