Parallel Split Learning with Global Sampling
About
Parallel split learning (PSL) suffers from two intertwined issues: the effective batch size grows with the number of clients, and data that is not independently and identically distributed (non-IID) skews global batches. We present parallel split learning with global sampling (GPSL), a server-driven scheme that fixes the global batch size by computing per-client batch-size schedules from the pooled label proportions; the actual samples are then drawn locally, without replacement, by each selected client. This eliminates per-class rounding, decouples the effective batch size from the client count, and makes each global batch distributionally equivalent to centralized uniform sampling without replacement. Consequently, we obtain finite-population deviation guarantees via Serfling's inequality and zero rounding bias compared with local sampling schemes. GPSL is a drop-in replacement for PSL with negligible overhead and scales to large client populations. In extensive experiments on CIFAR-10/100 with ResNet-18/34 under non-IID splits, GPSL stabilizes optimization and achieves centralized-like accuracy, while fixed local batching trails by up to 60%. GPSL also shortens training time by avoiding the inflation of training steps induced by data depletion. These findings suggest that GPSL is a promising and scalable approach for learning in resource-constrained environments.
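The core server-side idea can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: the function name `global_batch_schedule` and the contiguous-block index mapping are assumptions made for clarity. The server conceptually pools all client samples, draws one fixed-size global batch uniformly without replacement, and sends each client only the *count* of samples it should draw locally, which keeps the global batch distributionally equivalent to a centralized draw.

```python
import random
from itertools import accumulate
from bisect import bisect_right

def global_batch_schedule(client_sizes, batch_size, rng):
    """Server side (hypothetical sketch): decide how many samples each client
    contributes so that the union of local draws matches one centralized
    uniform draw without replacement of `batch_size` pooled samples."""
    total = sum(client_sizes)
    if batch_size > total:
        raise ValueError("global batch larger than pooled data")
    # Conceptually pool all samples; client c owns a contiguous index block.
    bounds = list(accumulate(client_sizes))       # cumulative block ends
    drawn = rng.sample(range(total), batch_size)  # centralized uniform draw
    counts = [0] * len(client_sizes)
    for idx in drawn:
        counts[bisect_right(bounds, idx)] += 1    # map pooled index -> owner
    return counts

# Each client then draws `counts[c]` of its own examples locally, without
# replacement; only the integer counts cross the network.
rng = random.Random(0)
schedule = global_batch_schedule([50, 30, 20], 16, rng)
print(schedule, sum(schedule))
```

Note that the per-client counts produced this way follow a multivariate hypergeometric distribution, which is exactly the distribution induced by centralized sampling without replacement; no per-class rounding is ever performed.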
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | CIFAR-10 IID | Accuracy | 84.74 | 166 |
| Image Classification | CIFAR-100 non-IID (test) | Test Accuracy (Avg Best) | 60.04 | 113 |
| Image Classification | CIFAR-10 non-IID (test) | Average Test Accuracy | 89.69 | 14 |
| Image Classification | CIFAR-10 Mild Non-IID (C=5, α=3.0) (test) | Top-1 Accuracy | 84.52 | 13 |
| Image Classification | CIFAR-10 Severe Non-IID (C=2, α=3.0) (test) | Top-1 Accuracy | 84.71 | 12 |