Fact Verification on FEVER (F1)
Top result: FLARE, 53.9 F1 Score (Mar 26, 2026). Updated 22d ago.
Evaluation Results
| Method | Configuration | Date | F1 Score |
|---|---|---|---|
| FLARE | LLM=Llama-3.1-8B, Writ... | 2026.03 | 53.9 |
| FLARE | LLM=Gemma-3-12B, Write... | 2026.03 | 51.31 |
| FLARE | LLM=Llama-3.1-8B, Writ... | 2026.03 | 48.18 |
| FLARE | LLM=Gemma-3-12B, Write... | 2026.03 | 46.25 |
| Naive RAG | LLM=Llama-3.1-8B, Writ... | 2026.03 | 39.89 |
| RePlug | LLM=Llama-3.1-8B, Writ... | 2026.03 | 39.6 |
| RePlug | LLM=Gemma-3-12B, Write... | 2026.03 | 37.92 |
| Naive RAG | LLM=Gemma-3-12B, Write... | 2026.03 | 37.89 |
| Self-RAG | LLM=Llama-3.1-8B, Writ... | 2026.03 | 34.77 |
| No Retrieval | LLM=Gemma-3-12B | 2026.03 | 34.24 |
| Naive RAG | LLM=Llama-3.1-8B, Writ... | 2026.03 | 34.08 |
| RePlug | LLM=Llama-3.1-8B, Writ... | 2026.03 | 33.96 |
| No Retrieval | LLM=Llama-3.1-8B | 2026.03 | 33.13 |
| Naive RAG | LLM=Gemma-3-12B, Write... | 2026.03 | 32.77 |
| RePlug | LLM=Gemma-3-12B, Write... | 2026.03 | 32.73 |
| Self-RAG | LLM=Gemma-3-12B, Write... | 2026.03 | 32.08 |
| Self-RAG | LLM=Llama-3.1-8B, Writ... | 2026.03 | 31.45 |
| Self-RAG | LLM=Gemma-3-12B, Write... | 2026.03 | 29.61 |