SERM: Self-Evolving Relevance Model with Agent-Driven Learning from Massive Query Streams
About
Because real-world query streams evolve continuously, relevance models struggle to generalize to practical search scenarios. Self-evolution is a promising solution. However, in large-scale industrial settings with massive query streams, it faces two challenges: (1) informative samples are sparse and difficult to identify, and (2) pseudo-labels generated by the current model can be unreliable. To address these challenges, we propose a Self-Evolving Relevance Model approach (SERM), which comprises two complementary multi-agent modules: a multi-agent sample miner, designed to detect distributional shifts and identify informative training samples, and a multi-agent relevance annotator, which provides reliable labels through a two-level agreement framework. We evaluate SERM in a large-scale industrial setting that serves billions of user requests daily. Experimental results show that SERM achieves significant performance gains through iterative self-evolution, as validated by extensive offline multilingual evaluations and online testing.
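The abstract does not spell out what the two agreement levels are. As a minimal sketch of one possible reading, assume level 1 is agreement among the annotator agents inside a group and level 2 is agreement across groups; a sample keeps its pseudo-label only if both levels pass. The names `consensus`, `two_level_label`, `groups`, and `min_share` are hypothetical and do not come from the paper.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# Hypothetical annotator interface: (query, document) -> relevance grade.
Annotator = Callable[[str, str], int]


def consensus(labels: Sequence[int], min_share: float) -> Optional[int]:
    """Return the most common label if its share reaches min_share, else None."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) >= min_share else None


def two_level_label(
    query: str,
    doc: str,
    groups: Sequence[Sequence[Annotator]],
    min_share: float = 0.75,
) -> Optional[int]:
    """Assumed scheme: accept a label only if every group reaches an internal
    consensus (level 1) and all groups agree with each other (level 2)."""
    group_votes = []
    for group in groups:
        vote = consensus([annotate(query, doc) for annotate in group], min_share)
        if vote is None:
            return None  # level 1 failed: this group is internally split
        group_votes.append(vote)
    # Level 2: all group-level votes must coincide (an empty list yields None).
    return group_votes[0] if len(set(group_votes)) == 1 else None
```

Samples for which `two_level_label` returns `None` would simply be left out of the next training round under this reading, trading label coverage for label reliability.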
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Search Relevance | Germanic language family Qwen2.5 series benchmark (test) | NDCG@1 | 87.56 | 18 |
| Search Relevance | Romance language family Qwen2.5 series benchmark (test) | NDCG@1 | 88.14 | 18 |
| Search Relevance | Minor Language family Qwen2.5 series benchmark (test) | NDCG@1 | 84.99 | 18 |
| Search Relevance | Online Search Platform Overall Current (Live Traffic) | User Negative Feedback | -1.2081 | 1 |
| Search Relevance | Online Search Platform Longtail Traffic Current | Change Query Ratio | -0.1312 | 1 |