Fact Verification on X-Fact In-Domain (ID)
[Chart: Macro-F1 over time. Best result: 67 Macro-F1 (SEEK), May 26, 2026.]
Evaluation Results
| Method | Model | Date | Macro-F1 |
|---|---|---|---|
| SEEK | LLaMA | 2026.05 | 67 |
| SEEK | Mistral | 2026.05 | 66 |
| Sentence Chunking | LLaMA | 2026.05 | 65 |
| SEEK | Gemma | 2026.05 | 64 |
| Sentence Chunking | Mistral | 2026.05 | 63 |
| Semantic Chunking | LLaMA | 2026.05 | 63 |
| Sentence Chunking | Gemma | 2026.05 | 61 |
| Semantic Chunking | Gemma | 2026.05 | 61 |
| Semantic Chunking | Mistral | 2026.05 | 60 |
| Search Snippets | Mistral | 2026.05 | 46 |
| Search Snippets | Gemma | 2026.05 | 43 |
| CONCRETE | Mistral | 2026.05 | 41 |
| Search Snippets | LLaMA | 2026.05 | 40 |
| CONCRETE | Gemma | 2026.05 | 35 |
| CONCRETE | LLaMA | 2026.05 | 33 |