
Positive-Congruent Training: Towards Regression-Free Model Updates

About

Reducing inconsistencies in the behavior of different versions of an AI system can be as important in practice as reducing its overall error. In image classification, sample-wise inconsistencies appear as "negative flips": a new model incorrectly predicts the output for a test sample that was correctly classified by the old (reference) model. Positive-congruent (PC) training aims to reduce the error rate while also reducing negative flips, thus maximizing congruence with the reference model only on positive predictions, unlike model distillation. We propose a simple approach to PC training, Focal Distillation, which enforces congruence with the reference model by giving more weight to samples that were correctly classified by it. We also find that, if the reference model itself can be chosen as an ensemble of multiple deep neural networks, negative flips can be further reduced without affecting the new model's accuracy.

Sijie Yan, Yuanjun Xiong, Kaustav Kundu, Shuo Yang, Siqi Deng, Meng Wang, Wei Xia, Stefano Soatto · 2020
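To make the two key ideas in the abstract concrete, here is a minimal NumPy sketch of (a) the negative-flip rate between an old and a new model and (b) a focal-distillation-style loss that upweights the distillation term on samples the old model classified correctly. All function names, and the additive alpha/beta weighting form, are illustrative assumptions based on the description above, not the paper's exact implementation.

```python
import numpy as np

def negative_flip_rate(old_pred, new_pred, labels):
    """Fraction of test samples the old model got right but the new model gets wrong."""
    old_pred, new_pred, labels = map(np.asarray, (old_pred, new_pred, labels))
    neg_flips = (old_pred == labels) & (new_pred != labels)
    return neg_flips.mean()

def _softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def focal_distillation_loss(new_logits, old_logits, labels, alpha=1.0, beta=5.0):
    """Cross-entropy on the new model plus a KL distillation term toward the
    old model, weighted per sample by alpha + beta * 1[old model correct].
    (alpha/beta form is an illustrative assumption.)"""
    p_new = _softmax(new_logits)
    p_old = _softmax(old_logits)
    n = len(labels)
    ce = -np.log(p_new[np.arange(n), labels] + 1e-12)           # standard CE loss
    kl = np.sum(p_old * (np.log(p_old + 1e-12)
                         - np.log(p_new + 1e-12)), axis=-1)     # KL(old || new)
    old_correct = old_logits.argmax(axis=-1) == np.asarray(labels)
    weight = alpha + beta * old_correct.astype(float)           # extra weight on old-correct samples
    return ce.mean() + (weight * kl).mean()

# One sample (index 2) was right under the old model and wrong under the new one:
print(negative_flip_rate([0, 1, 2, 2], [0, 1, 1, 2], [0, 1, 2, 3]))  # → 0.25
```

Setting `beta=0` recovers plain uniform distillation; a large `beta` concentrates the congruence pressure on exactly the samples where a regression (negative flip) could occur.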

Related benchmarks

Task | Dataset | Result | Rank
Natural Language Understanding | GLUE (dev) | SST-2 (Acc) 93.12 | 504
Dependency Parsing | Dependency Parsing | LCM Accuracy 60.64 | 15
Dependency Parsing | Dependency Parsing (deepbiaf → deepbiaf, NeuroNLP2 implementation, unlabeled metrics) | UCM Accuracy 67.21 | 10
Conversational Semantic Parsing | TOP (s2s-base-part ⇒ s2s-base) | EM Accuracy 86.9 | 5
Conversational Semantic Parsing | TOP (s2s-large-part ⇒ s2s-large) | EM Accuracy 87.65 | 5
Dependency Parsing | Dependency Parsing (stackptr → stackptr, NeuroNLP2 implementation, unlabeled metrics) | UCM Accuracy 67.21 | 5
Conversational Semantic Parsing | TOP (s2s-base ⇒ s2s-large) | EM Accuracy 87.65 | 5
