Fact-checking on Causal and Downstream Robustness Ablation Suite Averaged over 4 models
[Chart: Fact EMΔ. Top entry: HETA, 3.7 (Apr 14, 2026)]
Updated 2d ago
Evaluation Results
Method                  Variant      Date      Fact EMΔ
HETA                    Full         2026.04   3.7
HETA                    LR+WIN       2026.04   3.3
HETA                    w/o Hes...   2026.04   2.5
HETA                    w/o KL       2026.04   2
ReAGent                 Standard     2026.04   2
SEA-CoT                 Standard     2026.04   1.7
HETA                    w/o Tra...   2026.04   1.6
Progressive Inference   Standard     2026.04   1.5
fAML                    Standard     2026.04   1.3
ContextCite             Standard     2026.04   1.2
TDD-backward            Standard     2026.04   1.1
Peering (PML)           Standard     2026.04   1
Integrated Gradients    Standard     2026.04   0.9
Attention Rollout       Standard     2026.04   0.5