
SciQ

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Question Answering | SciQ | Accuracy 96 | 226 |
| Multiple Choice Question Answering | SciQ | Accuracy 100 | 74 |
| Science Question Answering | SciQ | Normalized Accuracy 97 | 44 |
| Question Answering | SciQ (train) | Accuracy 100 | 36 |
| Uncertainty quantification | SciQ (test) | AUROC 74.5 | 28 |
| Factual Question Answering | SciQ (ID) | Precision 76.44 | 24 |
| Multiple Choice Question Answering | SciQ MC | Mean Per-Step Regret 0.137 | 15 |
| Question Answering | SciQ Abstract | Mean Per-Step Regret 0.135 | 15 |
| Distractor Generation | Sciq (test) | Precision@1 24.3 | 15 |
| Question Answering | SciQ (test) | Accuracy 76.6 | 13 |
| Language Modeling | SciQ | Perplexity 11.95 | 13 |
| Question Answering | SciQ (D_eval) | Accuracy 71.4 | 12 |
| Reading Comprehension | SciQ | Accuracy 93.7 | 11 |
| Science Question Answering | SciQ standard (test) | Accuracy 90.2 | 8 |
| Downstream Task | SciQ | Accuracy 89.3 | 7 |
| Question answering | SciQ-ar | Accuracy 55.68 | 6 |
| Question Answering | SciQ | ANLL 52.8 | 4 |
| Hallucination detection | SciQ (test) | ANLL 59.9 | 4 |
| Disciplinary Knowledge | SciQ | Accuracy 81.1 | 4 |
| Question Answering | SciQ Abstract | Accuracy 80.6 | 3 |
| Reading Comprehension | SCIQ | Exact Match 58.92 | 3 |
| Question Answering | SciQ | Exact Match 75.98 | 3 |
| Multiple Choice Question Answering | SciQ MC | Accuracy 86.7 | 2 |
| Question Answering | SciQ | MAE 0.0045 | 2 |
| Question Answering | SciQ MC | Accuracy 86.7 | 1 |
Showing 25 of 26 rows