Fact-checking on CNN
Best result: ANCHOR, 62.1 Balanced Accuracy (May 11, 2026)
Updated 22d ago
Evaluation Results
| Method | Configuration | Date | Balanced Accuracy |
| --- | --- | --- | --- |
| ANCHOR | Backbone=Qwen2.5-72B,... | 2026.05 | 62.1 |
| Vanilla | Backbone=Qwen2.5-72B,... | 2026.05 | 60.4 |
| CoT | Backbone=Qwen2.5-72B,... | 2026.05 | 58.9 |
| BIRD | Backbone=Qwen2.5-72B,... | 2026.05 | 54 |
| Vanilla | Backbone=DeepSeek-V3-6... | 2026.05 | 50 |
| Vanilla | Backbone=Qwen2.5-72B,... | 2026.05 | 50 |
| CoT | Backbone=Qwen2.5-72B,... | 2026.05 | 50 |
| Vanilla | Backbone=DeepSeek-V3-6... | 2026.05 | 49.5 |
| CoT | Backbone=DeepSeek-V3-6... | 2026.05 | 49.5 |
| CoT | Backbone=DeepSeek-V3-6... | 2026.05 | 46.2 |
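For readers unfamiliar with the metric, the following is a minimal sketch of how Balanced Accuracy is commonly computed, assuming the standard definition (the mean of per-class recall, reported here as a percentage). The benchmark's own scoring script is not shown on this page, so the function name `balanced_accuracy` and the labels below are hypothetical and for illustration only.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Return the mean of per-class recall as a percentage.

    Assumes the standard definition; this is not the benchmark's own script.
    """
    total = defaultdict(int)    # number of examples per true class
    correct = defaultdict(int)  # correctly predicted examples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    if not total:
        raise ValueError("y_true is empty")
    recalls = [correct[c] / total[c] for c in total]
    return 100.0 * sum(recalls) / len(recalls)


# Hypothetical labels: an imbalanced two-class set and a constant predictor.
y_true = ["true", "true", "true", "false"]
always_true = ["true"] * 4
print(balanced_accuracy(y_true, always_true))  # 50.0
```

Under this definition, a predictor that always outputs the same label on a two-class task scores 50 regardless of class imbalance, which is why 50 is usually read as the chance-level baseline for such tasks.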