Misinformation Belief Evaluation on MISBELIEF hard misinformation with third round evidence
[Chart: Performance by date. Highlighted point: Qwen-turbo, 4.05 (Jan 9, 2026). Updated 1mo ago.]
Evaluation Results
Method         Settings                     Date       Performance   Rank
Qwen-turbo     Temperature=0, Access=...    2026.01    4.05          5
GPT-5          Temperature=0, Access=...    2026.01    3.91          4
Qwen2.5-72B    Temperature=0, Access=...    2026.01    3.85          3
Qwen2.5-32B    Temperature=0, Access=...    2026.01    3.69          2
Llama3-8B      Temperature=0, Hardwar...    2026.01    3.39          1