Truncated CoT Answering on ZebraLogicBench (ZLB)
[Chart: AOC by date. Best result: 58.7 AOC, Faithfulness Only (Feb 18, 2026).]
Evaluation Results
Method              Backbone    Date      AOC
Faithfulness Only   Qwen3-14B   2026.02   58.7
REMUL               Qwen3-14B   2026.02   56.6
MAT-Steer           Qwen3-14B   2026.02   52.7
Original            Qwen3-14B   2026.02   52
Balanced Rewards    Qwen3-14B   2026.02   51.2
Hint Optimized      Qwen3-14B   2026.02   48.1
Correctness Only    Qwen3-14B   2026.02   47.4