RACE

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Reading Comprehension | RACE high | Accuracy 94.5 | 295 |
| Reading Comprehension | RACE mid | Accuracy 89.9 | 196 |
| Reading Comprehension | RACE | Accuracy 89.95 | 151 |
| Machine Reading Comprehension | RACE (test) | RACE Accuracy (Medium) 95.4 | 111 |
| Reading Comprehension | RACE | Accuracy 74.93 | 75 |
| Multiple Choice Question Answering | RACE | Accuracy 98.24 | 64 |
| Reading Comprehension | RACE | Accuracy 68.3 | 59 |
| Out-of-Distribution Detection | RACE to MMLU | AUROC 87.57 | 41 |
| Machine Reading Comprehension | RACE | RACE Overall Accuracy 94.5 | 38 |
| Reading Comprehension | RACE-m | Accuracy 0.931 | 31 |
| Reading Comprehension | RACE-h | Accuracy 62.3 | 26 |
| Reading Comprehension | RACE-h (test) | Accuracy 89.95 | 26 |
| Uncertainty Estimation | RACE Llama-3.1-8B and Gemma-2-9B backbones (test) | AUROC 91.3 | 24 |
| Reading Comprehension | RACE | First-Token Accuracy 87.3 | 24 |
| Question Answering | RACE MRQA out-of-domain evaluation | EM 46.3 | 23 |
| Reading Comprehension | RACE | RACE Middle Score 70.2 | 21 |
| Distribution Alignment | Race Even | MAE 0.072 | 20 |
| Understanding | RACE Middle | Score 67.27 | 20 |
| Question Answering | RACE-C | Accuracy 93.66 | 19 |
| Reading Comprehension | Race M | Race M Score 45.47 | 18 |
| Reading Comprehension | Race-H | RACE-h Score 38.56 | 18 |
| Reading Comprehension | RACE Middle School | Accuracy (RACE MS) 95.4 | 16 |
| Reading Comprehension | RACE (dev) | Accuracy 88.1 | 16 |
| Reading Comprehension | RACE | Score 91 | 15 |
| Difficulty-controllable Question Generation | RACE (test) | Estimated Difficulty 2.18 | 15 |
Showing 25 of 64 rows