Question Answering on NQ SYNTH
[Chart: Accuracy by date. Best result: NWCAD, 79 (Apr 17, 2026)]
Updated 1mo ago
Evaluation Results
Method        Model            Date      Accuracy
NWCAD         Llama-3.1-70B    2026.04   79
AdaCAD        Llama-3.1-70B    2026.04   76.38
AdaCAD        Ministral-3-8B   2026.04   72.8
NWCAD         Ministral-3-8B   2026.04   71.8
CoCoA         Llama-3.1-70B    2026.04   71.62
With-context  Ministral-3-8B   2026.04   70.7
NWCAD         Llama-3.1-8B     2026.04   70
AdaCAD        Llama-3.1-8B     2026.04   69.8
With-context  Llama-3.1-8B     2026.04   69.6
With-context  Llama-3.1-70B    2026.04   69.5
CAD           Llama-3.1-70B    2026.04   66.62
CAD           Ministral-3-8B   2026.04   64.8
CoCoA         Llama-3.1-8B     2026.04   64.4
CAD           Llama-3.1-8B     2026.04   64.3
CoCoA         Ministral-3-8B   2026.04   58.9
Baseline      Llama-3.1-8B     2026.04   25.5
Baseline      Ministral-3-8B   2026.04   20.5
Baseline      Llama-3.1-70B    2026.04   19.25