TriviaQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Hallucination Detection | TriviaQA | AUROC 0.95 | 265 |
| Question Answering | TriviaQA | Accuracy 86.68 | 210 |
| Hallucination Detection | TriviaQA (test) | AUC-ROC 92.23 | 169 |
| Question Answering | TriviaQA (test) | Accuracy 85.18 | 121 |
| Question Answering | TriviaQA | EM 86.1 | 116 |
| Question Answering | TriviaQA | Accuracy 94.5 | 85 |
| Open-Domain Question Answering | TriviaQA (test) | Exact Match 72.6 | 80 |
| Uncertainty Estimation | TriviaQA (test) | AUROC 82.12 | 78 |
| Passage retrieval | TriviaQA (test) | Top-100 Acc 90.1 | 67 |
| Question Answering | TriviaQA | ACC 75 | 62 |
| Open-domain Question Answering | TriviaQA | EM 76.1 | 62 |
| Single-hop Question Answering | TriviaQA | EM 72 | 62 |
| Open-domain Question Answering | TriviaQA open (test) | EM 73.3 | 59 |
| Question Answering | TriviaQA (TQA) | EM 71.1 | 56 |
| Retrieval-Augmented Generation (RAG) | TriviaQA | Reliability Score 80.67 | 52 |
| Question Answering | TriviaQA | C 79.9 | 48 |
| Question Answering | TriviaQA Wiki (val) | Exact Match (EM) 87.6 | 48 |
| Question Answering | TriviaQA | F1 89.02 | 46 |
| Question Answering | TriviaQA (TQA) (test) | Robust Accuracy 75.4 | 45 |
| Open-Domain Question Answering | TriviaQA | SubEM 74.01 | 40 |
| Question Answering | TriviaQA | C 79.9 | 40 |
| End-to-end Open-Domain Question Answering | TriviaQA (test) | Exact Match (EM) 71.5 | 40 |
| Calibration | TriviaQA | Brier Score 0.0845 | 39 |
| General Question Answering | TriviaQA | Exact Match 69.02 | 39 |
| Uncertainty Estimation | TriviaQA | AUROC 83.63 | 37 |
Showing 25 of 176 rows