Confidence Calibration on MultiNLI Mismatch (test)

0.0071 ECE

MIR

[Chart: ECE by date (axis range 0.003328 to 0.079711); Jun 7, 2023]
Updated 4d ago

Evaluation Results

Date      ECE      (unlabeled)  (unlabeled)  (unlabeled)
2023.06   0.0071   0.0107       0.0026       0.0258
2023.06   0.0088   0.0107       0.0036       0.0246
2023.06   0.0103   0.0136       0.004        0.0235
2023.06   0.0105   0.015        0.0025       0.0251
2023.06   0.0111   0.0117       0.0036       0.0376
2023.06   0.0122   0.0135       0.004        0.0278
2023.06   0.0124   0.0126       0.0039       0.0277
2023.06   0.0143   0.0123       0.0034       0.0261
2023.06   0.0145   0.0146       0.0037       0.0278
2023.06   0.0158   0.0156       0.0066       0.0304
2023.06   0.017    0.0176       0.0041       0.0303
2023.06   0.0242   0.0246       0.0093       0.0356
2023.06   0.0247   0.0262       0.0085       0.0354
2023.06   0.0426   0.044        0.0283       0.0515
2023.06   0.0753   0.075        0.0421       0.0778
2023.06   0.1014   0.1014       0.074        0.1033
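
For readers unfamiliar with the metric reported here, the following is a minimal sketch of how expected calibration error (ECE) is commonly computed. It assumes equal-width confidence bins and a default of 10 bins; the page does not state the binning protocol used for these results, so the function name, the `n_bins` parameter, and the binning scheme are illustrative assumptions and not the benchmark's actual evaluation code.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE (assumed variant; the benchmark's protocol is not stated).

    confidences: predicted probability of the chosen class, each in [0, 1]
    correct:     1 if the prediction was right, 0 otherwise
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    conf_sum = [0.0] * n_bins   # sum of confidences per bin
    hit_sum = [0.0] * n_bins    # number of correct predictions per bin
    count = [0] * n_bins        # number of predictions per bin

    for p, y in zip(confidences, correct):
        # Bin k covers [k/n_bins, (k+1)/n_bins); a confidence of 1.0 goes in the last bin.
        b = min(int(p * n_bins), n_bins - 1)
        conf_sum[b] += p
        hit_sum[b] += y
        count[b] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b]:
            accuracy = hit_sum[b] / count[b]
            avg_confidence = conf_sum[b] / count[b]
            # Weight each bin's |accuracy - confidence| gap by its share of predictions.
            ece += (count[b] / n) * abs(accuracy - avg_confidence)
    return ece
```

Under this definition, lower is better: a model whose confidence matches its accuracy in every bin scores 0.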