Continual Learning of Long Topic Sequences in Neural Information Retrieval

About

In information retrieval (IR) systems, trends and users' interests may change over time, altering either the distribution of requests or the contents to be recommended. Since neural ranking approaches depend heavily on their training data, it is crucial to understand how well recent IR approaches transfer to new domains in the long term. In this paper, we first propose a dataset based on the MSMarco corpus that models a long stream of topics as well as controlled settings driven by IR properties. We then analyze in depth the ability of recent neural IR models to learn those streams continually. Our empirical study highlights the particular cases in which catastrophic forgetting occurs (e.g., the level of similarity between tasks, peculiarities of text length, and the way models are trained) to provide future directions for model design.

Thomas Gerald, Laure Soulier • 2022

Related benchmarks

Task | Dataset | Result | Rank
Continual Retrieval | MSMARCO streaming topic-clustered (Session 9) | Success@5: 92.2 | 14
Continual Retrieval | MSMARCO streaming topic-clustered (Session 6) | Success@5: 87.4 | 14
Continual Retrieval | MSMARCO streaming topic-clustered (Session 8) | Success@5: 84.8 | 14
Continual Retrieval | MSMARCO | S@5: 78.29 | 14
Continual Retrieval | MSMARCO streaming topic-clustered (Session 5) | Success@5: 68.1 | 14
Continual Retrieval | MSMARCO streaming topic-clustered (Session 7) | Success@5: 71.1 | 14
Retrieval | LoTTE Session 7 | Success@5: 33.2 | 14
Continual Retrieval | LoTTE | Success@5: 0.4213 | 14
Continual Retrieval | MSMARCO streaming topic-clustered (Session 4) | Success@5: 21.9 | 14
Continual Retrieval | MSMARCO streaming topic-clustered Average | Success@5: 47.76 | 14
Showing 10 of 24 rows
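
The Result column above reports Success@5. As a minimal sketch, assuming the usual definition (a query counts as a success when at least one relevant passage appears among its top 5 ranked results) and reporting the mean as a percentage, the metric could be computed as follows. The function names, the toy rankings, and the relevance judgments are hypothetical and do not come from the paper or the benchmark.

```python
from typing import Dict, List, Set


def success_at_k(ranked: List[str], relevant: Set[str], k: int = 5) -> float:
    """Return 1.0 if any of the top-k ranked ids is relevant, else 0.0."""
    return 1.0 if any(doc_id in relevant for doc_id in ranked[:k]) else 0.0


def mean_success_at_k(
    rankings: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
    k: int = 5,
) -> float:
    """Average Success@k over all queries, expressed as a percentage."""
    if not rankings:
        return 0.0
    hits = sum(
        success_at_k(ranked, qrels.get(query_id, set()), k)
        for query_id, ranked in rankings.items()
    )
    return 100.0 * hits / len(rankings)


# Toy data (hypothetical): q1 has its relevant passage at rank 3,
# q2 has its relevant passage at rank 6, just outside the top 5.
rankings = {
    "q1": ["d3", "d7", "d1", "d9", "d4", "d2"],
    "q2": ["d5", "d6", "d8", "d0", "d11", "d10"],
}
qrels = {"q1": {"d1"}, "q2": {"d10"}}

score = mean_success_at_k(rankings, qrels, k=5)
print(f"Success@5 = {score:.1f}")
```

In a continual setting, the same function could be applied to each session's queries separately and the per-session scores averaged, which is one plausible reading of the "Average" row; the listing does not state how that average was computed.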
