Logical Reasoning on PrOntoQA (Acc, NLDD)
[Chart: Accuracy over time. Shown: Llama-3.1-8B, Accuracy 100, Feb 4, 2026. Available metrics: Accuracy, NLDD.]
Updated 4d ago
Evaluation Results
| Method | Setting | Date | Accuracy | NLDD |
|---|---|---|---|---|
| Llama-3.1-8B | Regime=Faithful Regime... | 2026.02 | 100 | 20.6 |
| DeepSeek-Coder-6.7B | Regime=Faithful Regime... | 2026.02 | 100 | 84.3 |
| Gemma-2-9B | Regime=Anti-Faithful R... | 2026.02 | 99 | -52.5 |