Fact-checking on LIAR
Top result: 79 Accuracy (IO)
[Chart: Accuracy over time; data point dated Jan 30, 2026]
Updated 3d ago
Evaluation Results
Method  Configuration               Date     Accuracy
IO      Executor=GPT-5              2026.01  79
CoT     Executor=GPT-5              2026.01  79
UPA     Executor=GPT-5              2026.01  78.8
SPO     Executor=GPT-5              2026.01  78
UPA     Executor=Claude-4.5-So...   2026.01  74.7
SPO     Executor=Claude-4.5-So...   2026.01  74.1
CoT     Executor=Claude-4.5-So...   2026.01  73.8
IO      Executor=Claude-4.5-So...   2026.01  73.7
UPA     Executor=DeepSeek-V3.2      2026.01  68.6
CoT     Executor=DeepSeek-V3.2      2026.01  68.4
SPO     Executor=DeepSeek-V3.2      2026.01  68.2
IO      Executor=DeepSeek-V3.2      2026.01  67.9