
Measuring and Reducing Model Update Regression in Structured Prediction for NLP

About

Recent advances in deep learning have led to the rapid adoption of machine learning-based NLP models in a wide range of applications. Despite the continuous gains in accuracy, backward compatibility is also an important aspect for industrial applications, yet it has received little research attention. Backward compatibility requires that the new model does not regress on cases that were correctly handled by its predecessor. This work studies model update regression in structured prediction tasks. We choose syntactic dependency parsing and conversational semantic parsing as representative examples of structured prediction tasks in NLP. First, we measure and analyze model update regression in different model update settings. Next, we explore and benchmark existing techniques for reducing model update regression, including model ensemble and knowledge distillation. We further propose a simple and effective method, Backward-Congruent Re-ranking (BCR), which takes into account the characteristics of structured prediction. Experiments show that BCR can better mitigate model update regression than model ensemble and knowledge distillation approaches.
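The core quantity behind model update regression is the case the old model handled correctly but the new model gets wrong (often called a negative flip in the backward-compatibility literature). Below is a minimal Python sketch, not the authors' code, of how one might measure the negative flip rate with exact-match correctness for structured outputs, together with one plausible instantiation of backward-congruent re-ranking that picks, from the new model's n-best candidates, the structure the old model scores highest. The function names, the n-best representation, and the interpolation weight `alpha` are illustrative assumptions.

```python
# Sketch only: measuring model update regression via negative flips, and a
# BCR-style re-ranker over the new model's n-best list. All names here are
# illustrative, not taken from the paper's released code.

from typing import Callable, List, Sequence, Tuple


def negative_flip_rate(old_preds: Sequence, new_preds: Sequence,
                       golds: Sequence) -> float:
    """Fraction of examples the old model got right but the new model got wrong.

    For structured prediction, "right" is exact match of the full structure
    (e.g., complete-match metrics for parse trees, EM for semantic parses).
    """
    flips = sum(1 for o, n, g in zip(old_preds, new_preds, golds)
                if o == g and n != g)
    return flips / len(golds)


def bcr_rerank(nbest: List[Tuple[object, float]],
               old_model_score: Callable[[object], float],
               alpha: float = 1.0) -> object:
    """Re-rank the new model's n-best candidates with the old model's scores.

    nbest: (candidate_structure, new_model_score) pairs from the new model.
    old_model_score: scores a candidate structure under the old model
        (assumed interface; any likelihood-style scorer would do).
    alpha: interpolation weight; alpha=1.0 ranks purely by the old model.
    """
    best = max(nbest,
               key=lambda c: alpha * old_model_score(c[0])
               + (1.0 - alpha) * c[1])
    return best[0]
```

With `alpha = 1.0` the old model alone chooses among the new model's candidates, which keeps the new model's candidate quality while biasing the final selection toward structures the predecessor already agreed with, and thus away from negative flips.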

Deng Cai, Elman Mansimov, Yi-An Lai, Yixuan Su, Lei Shu, Yi Zhang • 2022

Related benchmarks

Each row reports performance under a model update setting (old model → new model).

| Task | Dataset / Update setting | Metric | Value | Rank |
|---|---|---|---|---|
| Dependency Parsing | Dependency Parsing | LCM Accuracy | 60.02 | 15 |
| Dependency Parsing | Dependency Parsing, deepbiaf → deepbiaf (NeuroNLP2 implementation, unlabeled metrics) | UCM Accuracy | 66.76 | 10 |
| Conversational Semantic Parsing | TOP, s2s-base-part → s2s-base | EM Accuracy | 86.16 | 5 |
| Conversational Semantic Parsing | TOP, s2s-large-part → s2s-large | EM Accuracy | 87.41 | 5 |
| Dependency Parsing | Dependency Parsing, stackptr → stackptr (NeuroNLP2 implementation, unlabeled metrics) | UCM Accuracy | 66.45 | 5 |
| Conversational Semantic Parsing | TOP, s2s-base → s2s-large | EM Accuracy | 87.49 | 5 |
