Question Answering on NaturalQA (test)
[Chart: Accuracy (top entry: Prompting, 23.4; Oct 1, 2025). Available metrics: Accuracy, Precision.]
Updated 5d ago
Evaluation Results
| Method | Settings | Links | Accuracy | Precision |
|---|---|---|---|---|
| Prompting | shots=5-shot | 2025.10 | 23.4 | 42.5 |
| DPO | | 2025.10 | 22.3 | 56.2 |
| AFH | mode=Absolute | 2025.10 | 21.7 | 43.3 |
| MASH w/ OTC-ST | | 2025.10 | 20.9 | 57.4 |
| MASH w/ EXP | | 2025.10 | 18.9 | 53.6 |
| AFH | mode=Multisample | 2025.10 | 14.7 | 54.8 |
| MASH w/ OTC | | 2025.10 | 0.1 | 53.6 |
| OTC | | 2025.10 | 0 | - |
| Ternary | | 2025.10 | 0 | - |