Fault-recognition on FaultyScience
Top accuracy: 67.8 (DeIllusionLLM)
[Chart: Accuracy; data point dated Mar 23, 2026]
Updated 25d ago
Evaluation Results
Method         Details                     Date     Accuracy
DeIllusionLLM  Distillation source=Qw...   2026.03  67.8
DeIllusionLLM  Distillation source=GP...   2026.03  42.7
GPT-4          Prompting protocol=nat...   2026.03  34.9
Llama3.3 70B   Prompting protocol=nat...   2026.03  14
Qwen2.5-72B    Prompting protocol=nat...   2026.03  10.8
Mixtral-8x7B   Prompting protocol=nat...   2026.03  10.2