Logical Reasoning on ProofWriter (Acc, TNFT, Tokens, Delay)
[Chart: Acc by method; highlighted entry: Vanilla, 92.2 Acc, May 12, 2026. Selectable metrics: Acc, TNFT, Tokens, Delay.]
Updated 21d ago
Evaluation Results
| Method  | Backbone   | Date    | Acc  | TNFT   | Tokens   | Delay |
|---------|------------|---------|------|--------|----------|-------|
| Vanilla | Qwen3-4B   | 2026.05 | 92.2 | 195.14 | 1,125.13 | 31.6  |
| Ours    | Qwen3-4B   | 2026.05 | 91.8 | 0      | 736.84   | 24.56 |
| Vanilla | Qwen3-1.7B | 2026.05 | 90.7 | 195.14 | 1,227.19 | 26.46 |
| Base    | Qwen3-4B   | 2026.05 | 90.5 | 197.14 | 2,066.1  | 62.94 |
| Ours    | Qwen3-1.7B | 2026.05 | 90.2 | 0      | 714.79   | 20.42 |
| Base    | Qwen3-1.7B | 2026.05 | 88.8 | 197.14 | 2,110.24 | 49.62 |