
ChaosNLI

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Natural Language Inference | ChaosNLI S easy (test) | Top-1 Accuracy 62.4 | 27 |
| Natural Language Inference | ChaosNLI S_amb (test) | Top-1 Accuracy 41.8 | 27 |
| Natural Language Inference | ChaosNLI (test) | Top-1 Accuracy 50.5 | 27 |
| Probability Calibration | ChaosNLI combined SNLI+MNLI K=3 | ECE 2.15 | 26 |
| PMI ranking estimation | ChaosNLI (500 held-out pairs) | Spearman Rho 0.82 | 16 |
| Natural Language Inference Distribution Estimation | ChaosNLI | KL Divergence 0 | 12 |
| Natural Language Inference Distribution Estimation | ChaosNLI 1.0 (test) | KL Div (Dist) 0 | 8 |
| Natural Language Inference Distribution Estimation | ChaosNLI 1.0 (dev) | KL Divergence (BERT FT) 0.073 | 8 |
| Natural Language Inference Distribution Estimation | ChaosNLI (test) | KL Divergence 0.06 | 4 |