CoT Faithfulness Detection on Truthful-QA
[Chart: Accuracy and F1 Score. Highlighted result: CIE-SCORER, 78 Accuracy, May 25, 2026.]
Updated 8d ago
Evaluation Results
| Method | Paradigm | Date | Accuracy | F1 Score |
|---|---|---|---|---|
| CIE-SCORER | Circuit-based | 2026.05 | 78 | 71.5 |
| CRV | Circuit-based | 2026.05 | 68 | 66.7 |
| Paraphrasing | Counterfactua... | 2026.05 | 67.8 | 49.1 |
| BiGGen | LLM-as-judge | 2026.05 | 61.1 | 67.3 |
| Adding Mistakes | Counterfactua... | 2026.05 | 51.1 | 60.7 |
| Option Shuffling | Counterfactua... | 2026.05 | 51.1 | 59.3 |
| Information Gain | Logits-based | 2026.05 | 47.8 | 40.5 |
| Perplexity | Baselines | 2026.05 | 44.4 | 40.5 |
| Random | Baselines | 2026.05 | 43.3 | 42.7 |
| Early Answering | Counterfactua... | 2026.05 | 40 | 52.6 |
| Answer Tracing | Logits-based | 2026.05 | 38.9 | 50.5 |
| Removing Steps | Counterfactua... | 2026.05 | 36.7 | 50.4 |
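The Accuracy and F1 Score columns rank the methods differently (Paraphrasing, for example, is third by Accuracy but scores below most methods on F1). The sketch below shows one way such a gap can arise. It assumes binary labels with a single positive class and the standard binary F1. The page does not state which F1 variant or label scheme the benchmark uses, so the function name `accuracy_and_f1` and the toy labels are illustrative assumptions, not the benchmark's own code or data.

```python
from typing import Sequence


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[float, float]:
    """Return (accuracy, F1) in percent for binary labels, where 1 is the positive class.

    Assumption: standard binary F1; the leaderboard may use a different variant.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = 100.0 * correct / len(pairs)
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when there are no positives at all.
    f1 = 100.0 * 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy labels invented for illustration only (not benchmark data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.1f} f1={f1:.1f}")  # accuracy=70.0 f1=40.0
```

With imbalanced labels, a detector that mostly predicts the majority class can reach a fairly high accuracy while missing most positives, which pulls F1 down. That is one possible reading of rows where the two columns disagree, not a confirmed explanation.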