MASSIVE

Benchmarks

Task Name	Dataset Name	SOTA Result
Sequence Classification	MASSIVE	Micro F1 80.36	64
Short-text Clustering	Massive (test)	NMI 78.88	20
Text Classification	MASSIVE (test)	Accuracy 78.6	18
Clustering	Massive-D	Accuracy 78	17
Intent Classification	MASSIVE (test)	In-Scope Accuracy 89.47	17
Slot Filling	MASSIVE Slotfill	F1 57.3	14
Classification	MassiveIntentClassification	Accuracy 77.08	11
Clustering	MASSIVE Intent	ARI 0.62	10
Intent Classification	MASSIVE (unsupervised)	Accuracy 79.15	9
Selective Prediction	MASSIVE (test)	Guaranteed Test Coverage (alpha=0.10) 100	8
Intent Classification	MASSIVE-Intent (test)	CFT Score 80.73	8
Slot Filling	MASSIVE-Slot (test)	CFT 62.54	8
Intent Classification	MASSIVE Intent	Accuracy 80.7	8
Intent Classification	MASSIVE	In-Scope Accuracy 66	8
Clustering	Massive I	Accuracy 60.5	7
Intent Classification	MASSIVE W5H2	Cost/1K 0	7
Intent Clustering	Massive (I)	NMI 0.7812	6
Text Classification	Massive	Label Quality 68	5
Out-of-Distribution Intent Detection	MASSIVE	F1-Macro 87.6	5
Intent Classification	MASSIVE (full)	F1-Macro 87.6	5
Intent Classification	MASSIVE W5H2 (test)	Accuracy 97.3	4
Out-of-Distribution Detection	Massive (test)	AUROC 0.9679	4
Uncertainty Calibration	MASSIVE (test)	ECE 0.059	4
Calibration	MASSIVE	ECE (Wrong Samples) 0.586	4
Intent Clustering	MASSIVE (test)	ARI 0.3	4

Showing 25 of 30 rows