MASSIVE

Benchmarks

Task Name                      Dataset Name                 SOTA Result                Trend
Sequence Classification        MASSIVE                      Micro F1 80.36             64
Short-text Clustering          Massive (test)               NMI 78.88                  20
Text Classification            MASSIVE (test)               Accuracy 78.6              18
Intent Classification          MASSIVE (test)               In-Scope Accuracy 89.47    17
Classification                 MassiveIntentClassification  Accuracy 77.08             11
Intent Classification          MASSIVE-Intent (test)        CFT Score 80.73            8
Slot Filling                   MASSIVE-Slot (test)          CFT 62.54                  8
Slot Filling                   MASSIVE Slotfill             F1 57.3                    8
Intent Classification          MASSIVE Intent               Accuracy 80.7              8
Intent Classification          MASSIVE                      In-Scope Accuracy 66       8
Intent Classification          MASSIVE W5H2                 Cost/1K 0                  7
Intent Clustering              Massive (I)                  NMI 0.7812                 6
Intent Classification          MASSIVE W5H2 (test)          Accuracy 97.3              4
Out-of-Distribution Detection  Massive (test)               AUROC 0.9679               4
Uncertainty Calibration        MASSIVE (test)               ECE 0.059                  4
Calibration                    MASSIVE                      ECE (Wrong Samples) 0.586  4
Intent Clustering              MASSIVE (test)               ARI 0.3                    4
Short-text Clustering          Massive                      NMI -                      0