STEM Question Answering on SciQ (first-token accuracy)

98.3 First-Token Accuracy

Llama-3.1-8B

May 21, 2025
Updated 13d ago

Evaluation Results

Date       First-Token Accuracy
2025.05    98.3
2025.05    98.3
2025.05    98.3
2025.05    98.2
2025.05    98.1
2025.05    98
2025.05    97.8
2025.05    97.5
2025.05    97.4
2025.05    97.4
2025.05    97.3
2025.05    97.2
2025.05    97.2
2025.05    97.1
2025.05    96.8
2025.05    95.9
2025.05    95.4
2025.05    95.1
2025.05    93.6
2025.05    93.1
2025.05    89.6
2025.05    89
2025.05    88.5
2025.05    31.8