An Adapter-free Fine-tuning Approach for Tuning 3D Foundation Models

About

Point cloud foundation models demonstrate strong generalization, yet adapting them to downstream tasks remains challenging in low-data regimes. Full fine-tuning often leads to overfitting and significant drift from pre-trained representations, while existing parameter-efficient fine-tuning (PEFT) methods mitigate this issue by introducing additional trainable components at the cost of increased inference-time latency. We propose Momentum-Consistency Fine-Tuning (MCFT), an adapter-free approach that bridges the gap between full and parameter-efficient fine-tuning. MCFT selectively fine-tunes a portion of the pre-trained encoder while enforcing a momentum-based consistency constraint to preserve task-agnostic representations. Unlike PEFT methods, MCFT introduces no additional representation learning parameters beyond a standard task head, maintaining the original model's parameter count and inference efficiency. We further extend MCFT with two variants: a semi-supervised framework that leverages abundant unlabeled data to enhance few-shot performance, and a pruning-based variant that improves computational efficiency through structured layer removal. Extensive experiments on object recognition and part segmentation benchmarks demonstrate that MCFT consistently outperforms prior methods, achieving a 3.30% gain in 5-shot settings and up to a 6.13% improvement with semi-supervised learning, while remaining well-suited for resource-constrained deployment.
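The abstract does not spell out the exact form of the momentum-based consistency constraint, so the following is only a minimal sketch of one common way such a constraint can be set up. It assumes a momentum copy of the encoder maintained as an exponential moving average (EMA) of the fine-tuned weights, and a mean-squared-error penalty between the features of the two copies. The function names, the momentum coefficient `m`, the loss weight `lam`, and the choice of MSE are all illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch: EMA ("momentum") weights plus a feature-consistency penalty.
# None of these names or hyperparameters come from the MCFT paper.

def ema_update(momentum_params, online_params, m=0.99):
    """Move each momentum parameter a small step toward the online (fine-tuned) one."""
    return [m * t + (1.0 - m) * s for t, s in zip(momentum_params, online_params)]


def consistency_loss(online_feats, momentum_feats):
    """Mean squared error between online and momentum features (assumed form)."""
    n = len(online_feats)
    return sum((a - b) ** 2 for a, b in zip(online_feats, momentum_feats)) / n


def total_loss(task_loss, online_feats, momentum_feats, lam=1.0):
    """Task loss plus a weighted consistency term that discourages feature drift."""
    return task_loss + lam * consistency_loss(online_feats, momentum_feats)


if __name__ == "__main__":
    # Toy parameters and features standing in for encoder weights and outputs.
    momentum = ema_update([1.0, 0.0], [0.0, 1.0], m=0.9)
    print(momentum)
    print(total_loss(0.5, [1.0, 2.0], [1.0, 4.0], lam=0.1))
```

In a setup like this, only the online copy receives gradients, and the momentum copy is discarded after training, which would be consistent with the abstract's claim that no extra parameters remain at inference time.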

Sneha Paul, Zachary Patterson, Nizar Bouguila • 2026

Related benchmarks

Task	Dataset	Metric	Result
Part Segmentation	ShapeNetPart	mIoU (Instance)	89	246
3D Object Classification	ModelNet40 few-shot	Accuracy	82.93	70
Classification	ScanObjectNN	OA	93.1	67
object recognition	ModelNet40 5-way	Accuracy	98.3	40
object recognition	ModelNet40 10-way	Accuracy	95.9	30
object recognition	ScanObjectNN fully-supervised (PB)	Overall Accuracy (OA)	90.8	28
object recognition	ModelNet40 fully-supervised (test)	Overall Accuracy (OA)	95.2	26
object recognition	ScanObjectNN fully-supervised (BG)	Overall Accuracy (OA)	94.9	24
object recognition	ModelNet40 20-shot	Accuracy (20-shot)	86.83	10
object recognition	ScanObjectNN OBJ_ONLY 5-shot	Accuracy	61.1	10

Showing 10 of 12 rows
