Fact Verification on X-Fact Out-of-Domain (OOD)
[Chart: Macro-F1 over time. Best result: SEEK, 41 Macro-F1 (May 26, 2026). Updated 7d ago.]
Evaluation Results
Method              Model     Date      Macro-F1
SEEK                LLaMA     2026.05   41
SEEK                Gemma     2026.05   39
SEEK                Mistral   2026.05   39
Sentence Chunking   Mistral   2026.05   37
Semantic Chunking   Gemma     2026.05   37
Semantic Chunking   Mistral   2026.05   37
Sentence Chunking   Gemma     2026.05   35
Semantic Chunking   LLaMA     2026.05   33
Sentence Chunking   LLaMA     2026.05   31
Search Snippets     Mistral   2026.05   30
Search Snippets     Gemma     2026.05   28
Search Snippets     LLaMA     2026.05   26
CONCRETE            Mistral   2026.05   22
CONCRETE            Gemma     2026.05   21
CONCRETE            LLaMA     2026.05   19