Predicting mutational effects on protein-protein binding via a side-chain diffusion probabilistic model
About
Many crucial biological processes rely on networks of protein-protein interactions. Predicting the effect of amino acid mutations on protein-protein binding is vital in protein engineering and therapeutic discovery. However, the scarcity of annotated experimental data on binding energy poses a significant challenge for developing computational approaches, particularly deep learning-based methods. In this work, we propose SidechainDiff, a representation learning-based approach that leverages unlabelled experimental protein structures. SidechainDiff uses a Riemannian diffusion model to learn the generative process of side-chain conformations, and it also yields representations of the structural context of mutations at the protein-protein interface. Leveraging the learned representations, we achieve state-of-the-art performance in predicting mutational effects on protein-protein binding. Furthermore, SidechainDiff is the first diffusion-based generative model for side-chains, distinguishing it from prior efforts that have predominantly focused on generating protein backbone structures.
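To make the idea of diffusion over side-chain conformations more concrete, the following is a minimal sketch of a forward noising step. It assumes (the abstract does not say this) that a side-chain conformation is represented by its torsion angles in radians, so that the state lives on a flat torus, and that noise is added as Brownian motion with the result wrapped back into range. The names `wrap_angle`, `forward_noise`, `sigma`, and `t` are hypothetical and are not taken from SidechainDiff.

```python
import math
import random


def wrap_angle(theta):
    """Map an angle in radians back into the interval [-pi, pi)."""
    return (theta + math.pi) % (2 * math.pi) - math.pi


def forward_noise(chis, t, sigma=1.0, rng=random):
    """Perturb torsion angles with Brownian motion on a flat torus.

    chis  : list of torsion angles in radians (assumed representation)
    t     : diffusion time, t >= 0
    sigma : hypothetical noise scale
    rng   : any object with a gauss(mu, sigma) method
    """
    std = sigma * math.sqrt(t)
    # Gaussian step in the covering space, then wrap back onto the circle.
    return [wrap_angle(c + rng.gauss(0.0, std)) for c in chis]


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [math.radians(a) for a in (-60.0, 180.0, 65.0, 90.0)]
    noisy = forward_noise(clean, t=0.5, rng=rng)
    print([round(math.degrees(a), 1) for a in noisy])
```

A generative model of this kind would then learn to reverse such a noising process; that learned reverse step is not shown here.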
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Binding affinity prediction | SKEMPI2 (all mutations) | Pearson Corr (Overall) | 0.669 | 24 |
| Binding affinity prediction | SKEMPI2 single mutations | Pearson Correlation (Overall) | 0.672 | 12 |
| Side-chain conformation prediction | PDB-REDO (test) | Chi 1 MAE | 18 | 6 |
| Binding affinity prediction | SARS-CoV-2 RBD (PDB ID: 6M0J) (285 single-point mutations) | Pearson R | 0.466 | 4 |
| Mutational effect prediction | Human antibody against SARS-CoV-2 RBD (PDB ID: 7FAE) heavy chain CDR region | TH31W | 7.29 | 4 |
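The two recurring metrics in the table can be sketched as follows. This is an illustrative computation only: the function names are hypothetical, and it assumes that the chi-angle error is measured in degrees with the difference wrapped to the shortest arc, which the table does not state. The input numbers in the usage lines are made up for demonstration and are not results from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between predicted and experimental values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def chi_mae_degrees(pred, true):
    """Mean absolute torsion-angle error in degrees.

    Differences are wrapped into [-180, 180) so that, for example,
    179 and -179 degrees count as 2 degrees apart (assumed convention).
    """
    diffs = [abs((p - t + 180.0) % 360.0 - 180.0) for p, t in zip(pred, true)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(pearson_r([0.1, 1.2, -0.4, 2.0], [0.3, 0.9, -0.1, 1.7]))
    print(chi_mae_degrees([179.0, -60.0], [-179.0, -65.0]))
```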