Multinoulli Extension: A Lossless Continuous Relaxation for Partition-Constrained Subset Selection
About
Identifying the most representative subset for a close-to-submodular objective while satisfying the predefined partition constraint is a fundamental task with numerous applications in machine learning. However, the existing distorted local-search methods are often hindered by their prohibitive query complexities and the rigid requirement for prior knowledge of difficult-to-obtain structural parameters. To overcome these limitations, we introduce a novel algorithm titled Multinoulli-SCG, which not only is parameter-free, but also can achieve the same approximation guarantees as the distorted local-search methods with significantly fewer function evaluations. More specifically, when the objective function is monotone $\alpha$-weakly DR-submodular or $(\gamma,\beta)$-weakly submodular, our Multinoulli-SCG algorithm can attain a value of $(1-e^{-\alpha})\text{OPT}-\epsilon$ or $(\frac{\gamma^{2}(1-e^{-(\beta(1-\gamma)+\gamma^2)})}{\beta(1-\gamma)+\gamma^2})\text{OPT}-\epsilon$ with only $O(1/\epsilon^{2})$ function evaluations, where OPT denotes the optimal value. The cornerstone of our Multinoulli-SCG algorithm is an innovative continuous-relaxation framework named Multinoulli Extension(ME), which can effectively convert the discrete subset selection problem subject to partition constraints into a solvable continuous maximization focused on learning the optimal multinoulli priors across the concerned partition. In sharp contrast with the well-established multi-linear extension for submodular subset selection, a notable advantage of our proposed ME is its intrinsic capacity to provide a lossless rounding scheme for any set function. Furthermore, based on our proposed ME, we also present two novel online algorithms, namely, Multinoulli-OSCG and Multinoulli-OSGA, for the unexplored online subset selection problems over partition constraints.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Online Subset Selection | Partition-constrained subset selection | Approximation Ratio2 | 7 | |
| Bayesian A-optimal Design | Housing | Objective Value54.64 | 6 | |
| Bayesian A-optimal Design | ionosphere | Objective Function Value282.4 | 6 | |
| Bayesian A-optimal Design | SONAR | Objective Value613.8 | 6 | |
| Coverage Maximization | Video Summarization n=30, k=6 | Coverage Objective Score53 | 6 | |
| Video Summarization | Cooking website Video sequence 1.0 | Object Score50.63 | 6 | |
| Video Summarization | Animation website Video sequence V2 1.0 | Object Score79.99 | 6 | |
| Video Summarization | VSUMM Soccer V3 1.0 | Object Score50.27 | 6 | |
| Video Summarization | VSUMM Live news V4 1.0 | Obj Coverage19.53 | 6 | |
| Video Summarization | VSUMM Broadcast V5 1.0 | Obj Score55.14 | 6 |