Parallel Split Learning with Global Sampling
About
Parallel split learning (PSL) suffers from two intertwined issues: the effective batch size grows with the number of clients, and data that is not independently and identically distributed (non-IID) skews global batches. We present parallel split learning with global sampling (GPSL), a server-driven scheme that fixes the global batch size by computing per-client batch-size schedules from the pooled label proportions; the actual samples are then drawn locally, without replacement, by each selected client. This eliminates per-class rounding, decouples the effective batch size from the client count, and makes each global batch distributionally equivalent to centralized uniform sampling without replacement. Consequently, we obtain finite-population deviation guarantees via Serfling's inequality and zero rounding bias compared with local sampling schemes. GPSL is a drop-in replacement for PSL with negligible overhead and scales to large client populations. In extensive experiments on CIFAR-10/100 with ResNet-18/34 under non-IID splits, GPSL stabilizes optimization and achieves centralized-like accuracy, while fixed local batching trails by up to 60%. GPSL also shortens training time by avoiding the inflation of training steps induced by data depletion. These findings suggest that GPSL is a promising and scalable approach for learning in resource-constrained environments.
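The core server-side idea can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: the function name `global_batch_schedule` and the contiguous-block index mapping are assumptions made for clarity. The server conceptually pools all client samples, draws one fixed-size global batch uniformly without replacement, and sends each client only the *count* of samples it should draw locally, which keeps the global batch distributionally equivalent to a centralized draw.

```python
import random
from itertools import accumulate
from bisect import bisect_right

def global_batch_schedule(client_sizes, batch_size, rng):
    """Server side (hypothetical sketch): decide how many samples each client
    contributes so that the union of local draws matches one centralized
    uniform draw without replacement of `batch_size` pooled samples."""
    total = sum(client_sizes)
    if batch_size > total:
        raise ValueError("global batch larger than pooled data")
    # Conceptually pool all samples; client c owns a contiguous index block.
    bounds = list(accumulate(client_sizes))       # cumulative block ends
    drawn = rng.sample(range(total), batch_size)  # centralized uniform draw
    counts = [0] * len(client_sizes)
    for idx in drawn:
        counts[bisect_right(bounds, idx)] += 1    # map pooled index -> owner
    return counts

# Each client then draws `counts[c]` of its own examples locally, without
# replacement; only the integer counts cross the network.
rng = random.Random(0)
schedule = global_batch_schedule([50, 30, 20], 16, rng)
print(schedule, sum(schedule))
```

Note that the per-client counts produced this way follow a multivariate hypergeometric distribution, which is exactly the distribution induced by centralized sampling without replacement; no per-class rounding is ever performed.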
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | CIFAR-10 IID | Accuracy | 84.74 | 166 |
| Image Classification | CIFAR-100 non-IID (test) | Test Accuracy (Avg Best) | 60.04 | 113 |
| Image Classification | CIFAR-10 non-IID (test) | Average Test Accuracy | 89.69 | 14 |
| Image Classification | CIFAR-10 Mild Non-IID (C=5, α=3.0) (test) | Top-1 Accuracy | 84.52 | 13 |
| Image Classification | CIFAR-10 Severe Non-IID (C=2, α=3.0) (test) | Top-1 Accuracy | 84.71 | 12 |