Fact Verification on FEVER (Accuracy, ECE)
[Chart: Accuracy and ECE. Top Accuracy: 99.7 (LPF-SPN), Mar 13, 2026]
Updated 1mo ago
Evaluation Results
| Method | Details | Date | Accuracy | ECE |
|---|---|---|---|---|
| LPF-SPN | Runtime (ms)=25.2, Spe... | 2026.03 | 99.7 | 1.2 |
| LPF-Learned | Runtime (ms)=24.0, Spe... | 2026.03 | 99.7 | 0.3 |
| VAE-Only | Runtime (ms)=3.5, Spee... | 2026.03 | 99.7 | 0.3 |
| Qwen3-32B | Runtime (ms)=3176.4, S... | 2026.03 | 62 | 82.3 |
| Kimi-K2 | Runtime (ms)=609.5, Sp... | 2026.03 | 56 | 87.3 |
| GPT-OSS-120B | Runtime (ms)=1718.2, S... | 2026.03 | 54 | 86.8 |
| Llama-3.3-70B | Runtime (ms)=1581.6, S... | 2026.03 | 44 | 74.4 |