Question Answering on Restate hard
[Chart: Accuracy over time. Top result: NWCAD, 94.4 accuracy (Apr 17, 2026).]
Evaluation Results
Method        Model            Date      Accuracy
NWCAD         Llama-3.1-70B    2026.04   94.4
With-context  Llama-3.1-70B    2026.04   92.8
AdaCAD        Llama-3.1-70B    2026.04   92.2
NWCAD         Llama-3.1-8B     2026.04   90.02
CoCoA         Llama-3.1-70B    2026.04   89.6
With-context  Llama-3.1-8B     2026.04   89.55
AdaCAD        Llama-3.1-8B     2026.04   89.55
NWCAD         Ministral-3-8B   2026.04   88.33
With-context  Ministral-3-8B   2026.04   87.86
AdaCAD        Ministral-3-8B   2026.04   86.94
CoCoA         Llama-3.1-8B     2026.04   85.1
CAD           Ministral-3-8B   2026.04   78.96
CAD           Llama-3.1-8B     2026.04   75.58
CoCoA         Ministral-3-8B   2026.04   74.19
CAD           Llama-3.1-70B    2026.04   67.6
Baseline      Llama-3.1-8B     2026.04   57.76
Baseline      Llama-3.1-70B    2026.04   50
Baseline      Ministral-3-8B   2026.04   39.48