Symbolic Reasoning on Date Understanding (DU)
Top result: SoftCoT, 87.2 accuracy.
[Chart: Accuracy over time; data point dated Feb 17, 2025]
Evaluation Results
Method                Settings                   Links    Accuracy
SoftCoT               N=10, Backbone=Qwen3-8B    2025.02  87.2
SoftCoT               N=1, Backbone=Qwen3-8B     2025.02  85.6
Zero-Shot Assist-CoT  N=10, Backbone=Qwen3-8B    2025.02  84.8
Zero-Shot CoT         N=10, Backbone=Qwen3-8B    2025.02  84.56
Zero-Shot Assist-CoT  N=1, Backbone=Qwen3-8B     2025.02  80.56
Zero-Shot CoT         N=1, Backbone=Qwen3-8B     2025.02  80.32
SoftCoT               Backbone=LLaMA-3.1-8B-...  2025.02  59.04
Zero-Shot Assist-CoT  Backbone=LLaMA-3.1-8B-...  2025.02  58.24
Zero-Shot CoT         Backbone=LLaMA-3.1-8B-...  2025.02  54.4
Zero-Shot CoT-Unk     Backbone=LLaMA-3.1-8B-...  2025.02  54.16