Natural Language Inference on MultiNLI mismatched (test)

81.4 Accuracy

LM-Transformer

Updated 4mo ago

Evaluation Results

Method	Links	Accuracy
LM-Transformer 2018.05		81.4
DRCN+ELMo 2018.05		81.4
DRCN 2018.05		79.5
CAFE 2018.05		79
DIIN 2017.11		78.8
DIIN 2017.09		78.7
DIIN 2018.05		78.7
DRCN 2018.05		78.4
CAFE 2018.05		77.9
CAFE 2017.11		77.9
DIIN 2017.09		77.8
DIIN 2018.05		77.8
KIM 2017.11		76.4
ESIM 2017.11		75.8
Gated-Att BiLSTM 2017.09		74.9
BiLSTM with generalized pooling 2018.09		74
ESIM GloVe 2017.06		73.92
2-layer Bi-CAS-LSTM 2018.09		73.7
Deep Gated Attn. BiLSTM encoders 2017.12		73.6
Gated-Att BiLSTM 2017.09		73.6
Shortcut-Stacked encoder 2017.09		73.6
Deep Gated Attn. BiLSTM encoders 2018.08		73.6
Shortcut stacked BiLSTM 2018.09		73.6
BiLSTM with gated pooling 2018.09		73.6
Gated BiLSTM 2017.11		73.6
SS BiLSTM 2017.11		73.6
Shortcut-Stacked BiLSTM encoders 2017.12		73.5
Shortcut-Stacked BiLSTM 2018.08		73.5
3-layer Bi-CAS-LSTM 2018.09		73.4
2-layer CAS-LSTM 2018.09		73.3
3-layer CAS-LSTM 2018.09		73.1
HBMP 2018.08		73
Distance-based Self-Attention Network 2017.12		72.9
InnerAtt 2017.09		72.8
BiLSTM + Inner-attention 2017.12		72.1
InnerAtt 2017.09		72.1
ESIM 2017.09		72.1
ESIM 2018.05		72.1
BiLSTM + Inner-attention 2018.08		72.1
Directional Self-Attention Network 2017.12		71.4
DiSAN 2017.11		71.4
BiLSTM + enhanced embedding + max pooling 2017.12		70.8
BiLSTM + enh embed + max pooling 2018.08		70.8
ESIM dictionary 2017.06		70.7
ESIM spelling 2017.06		69.76
ESIM baseline 2017.06		68.57
Char-level Intra-attention BiLSTM encoders 2017.12		68.2
BiLSTM 2017.09		67.6
BiLSTM 2017.12		67.1
BiLSTM 2018.08		67.1
BiLSTM 2018.09		66.9
BiLSTM 2017.11		66.9
CBOW 2017.12		64.6
CBOW 2018.08		64.6
CBOW 2018.09		64.5
CBOW 2017.11		64.5