Protein Inverse Folding From Structure Feedback

About

The inverse folding problem, which aims to design amino acid sequences that fold into desired three-dimensional structures, is pivotal for many biotechnological applications. Here, we introduce an approach that uses Direct Preference Optimization (DPO) to fine-tune an inverse folding model with feedback from a protein folding model. Given a target protein structure, we first sample candidate sequences from the inverse folding model, then predict the three-dimensional structure of each sequence with the folding model to generate pairwise structural-preference labels. These labels are used to fine-tune the inverse folding model under the DPO objective. Our results on the CATH 4.2 test set show that DPO fine-tuning not only improves the sequence recovery of baseline models but also raises the average TM-Score from 0.77 to 0.81, indicating closer structural similarity to the target. Furthermore, applying our DPO-based method iteratively to challenging protein structures yields substantial gains, with an average TM-Score increase of 79.5% relative to the baseline model. This work establishes a promising direction for improving protein sequence design from structure feedback through preference optimization.
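The pipeline described above (sample sequences, fold them, turn the structural scores into pairwise preferences, then optimize the DPO objective) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of TM-score as the preference signal, the `margin` tie filter, and `beta=0.1` are all assumptions made for the example, and the log-probabilities would in practice come from the fine-tuned and frozen reference inverse folding models.

```python
import math
from itertools import combinations


def build_preference_pairs(scored, margin=0.0):
    """Turn (sequence, tm_score) tuples into (chosen, rejected) pairs.

    Hypothetical helper: the sequence whose predicted structure has the
    higher TM-score to the target is treated as preferred. Pairs whose
    score gap is at most `margin` are skipped as uninformative.
    """
    pairs = []
    for (seq_a, tm_a), (seq_b, tm_b) in combinations(scored, 2):
        if abs(tm_a - tm_b) <= margin:
            continue
        pairs.append((seq_a, seq_b) if tm_a > tm_b else (seq_b, seq_a))
    return pairs


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair.

    loss = -log sigmoid(beta * [(pi_w - ref_w) - (pi_l - ref_l)])
    where each argument is a sequence log-probability given the target
    structure. beta=0.1 is an assumed value, not taken from the paper.
    """
    logits = beta * ((policy_logp_chosen - ref_logp_chosen)
                     - (policy_logp_rejected - ref_logp_rejected))
    # Numerically stable -log(sigmoid(logits)) = softplus(-logits).
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# Toy candidates with made-up TM-scores against a target structure.
candidates = [("AAA", 0.8), ("CCC", 0.6), ("GGG", 0.8)]
pairs = build_preference_pairs(candidates)
```

When the policy and reference models assign identical log-probabilities, the loss equals log 2; it falls below that value as the policy shifts probability toward the chosen sequence relative to the reference.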

Junde Xu, Zijun Gao, Xinyi Zhou, Jie Hu, Xingyi Cheng, Le Song, Guangyong Chen, Pheng-Ann Heng, Jiezhong Qiu • 2025

Related benchmarks

Task	Dataset	Result
Protein Sequence Design	CATH 4.3 (150-300 residues)	TM Score 84.67	17
Protein Sequence Design	CATH 0-150 residues 4.3	TM Score 84.36	17
Protein Inverse Folding	CATH 0-150 residues 4.3 (test)	Recovery Rate 56.9	7
Protein Inverse Folding	CATH 150-300 residues 4.3 (test)	Recovery Rate 56.9	7
