Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Near-optimal Linear Predictive Clustering in Non-separable Spaces via MIP and QPBO Reductions

About

Linear Predictive Clustering (LPC) partitions samples based on shared linear relationships between feature and target variables, with numerous applications including marketing, medicine, and education. Greedy optimization methods, commonly used for LPC, alternate between clustering and linear regression but lack global optimality. While effective for separable clusters, they struggle in non-separable settings where clusters overlap in feature space. In an alternative constrained optimization paradigm, Bertsimas and Shioda (2007) formulated LPC as a Mixed-Integer Program (MIP), ensuring global optimality regardless of separability but suffering from poor scalability. This work builds on the constrained optimization paradigm to introduce two novel approaches that improve the efficiency of global optimization for LPC. By leveraging key theoretical properties of separability, we derive near-optimal approximations with provable error bounds, significantly reducing the MIP formulation's complexity and improving scalability. Additionally, we can further approximate LPC as a Quadratic Pseudo-Boolean Optimization (QPBO) problem, achieving substantial computational improvements in some settings. Comparative analyses on synthetic and real-world datasets demonstrate that our methods consistently achieve near-optimal solutions with substantially lower regression errors than greedy optimization while exhibiting superior scalability over existing MIP formulations.

Jiazhou Liang, Hassan Khurram, Scott Sanner• 2025

Related benchmarks

TaskDatasetResultRank
ClusteringStock Portfolio Performance 390
Regression Coefficient Difference18.393
3
ClusteringServo
Regression Coefficient Difference4.7162
3
ClusteringSolar Flare 89
Regression Coefficient Difference3.4549
3
ClusteringProductivity Prediction 597
Regression Coefficient Difference9.7577
3
ClusteringHeart Failure Prediction 519
Regression Coefficient Difference1.8778
3
ClusteringFacebook Metrics
Regression Coefficient Difference0.00e+0
3
ClusteringStock Portfolio Performance 390
Cluster Assignment Mismatch8.73
3
ClusteringServo 87
Cluster Assignment Mismatch37.68
3
ClusteringSolar Flare 89
Cluster Assignment Mismatch22.17
3
Clusteringliver-disorders
Cluster Assignment Mismatch44.93
3
Showing 10 of 27 rows

Other info

Follow for update