Intrinsic-Extrinsic Convolution and Pooling for Learning on 3D Protein Structures
About
Proteins perform a large variety of functions in living organisms and thus play a key role in biology. However, currently available learning algorithms for protein data do not take several particularities of such data into account and/or do not scale well to large protein conformations. To fill this gap, we propose two new learning operations that enable deep 3D analysis of large-scale protein data. First, we introduce a novel convolution operator that considers both the intrinsic (invariant under protein folding) and the extrinsic (invariant under bonding) structure by using $n$-D convolutions defined on the Euclidean distance as well as on multiple geodesic distances between atoms in a multi-graph. Second, we enable multi-scale protein analysis by introducing hierarchical pooling operators, which exploit the fact that proteins are recombinations of a finite set of amino acids and can therefore be pooled using shared pooling matrices. Lastly, we evaluate the accuracy of our algorithms on several large-scale data sets for common protein analysis tasks, where we outperform state-of-the-art methods.
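To make the two kinds of distance concrete, the following is a minimal sketch, not the authors' implementation. It computes, for one source atom, the Euclidean distance (extrinsic) and shortest-path hop counts (intrinsic) on two edge sets of a toy multi-graph. The four-atom chain, the choice of a covalent edge set and a covalent-plus-hydrogen-bond edge set, and the names `geodesic_hops` and `kernel_inputs` are all illustrative assumptions. How the resulting distance tuples are fed into the convolution kernel is not specified here.

```python
import math
from collections import deque


def euclidean_distance(a, b):
    """Extrinsic distance: straight-line distance between two 3D positions."""
    return math.dist(a, b)


def geodesic_hops(num_atoms, bonds, source):
    """Intrinsic distance: number of bonds on the shortest path from `source`.

    Atoms that cannot be reached from `source` keep the value math.inf.
    """
    neighbors = [[] for _ in range(num_atoms)]
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    hops = [math.inf] * num_atoms
    hops[source] = 0
    queue = deque([source])
    while queue:  # breadth-first search over the bond graph
        current = queue.popleft()
        for nxt in neighbors[current]:
            if hops[nxt] == math.inf:
                hops[nxt] = hops[current] + 1
                queue.append(nxt)
    return hops


# Hypothetical toy chain of four atoms folded back on itself.
positions = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0), (0.0, 1.5, 0.0)]
covalent_bonds = [(0, 1), (1, 2), (2, 3)]
# Assumed second edge set: the same bonds plus one hydrogen bond closing the loop.
covalent_and_hydrogen_bonds = covalent_bonds + [(0, 3)]

source = 0
hops_covalent = geodesic_hops(len(positions), covalent_bonds, source)
hops_with_hbond = geodesic_hops(len(positions), covalent_and_hydrogen_bonds, source)

# One tuple per neighbor: (Euclidean, geodesic on graph 1, geodesic on graph 2).
kernel_inputs = [
    (euclidean_distance(positions[source], positions[j]),
     hops_covalent[j],
     hops_with_hbond[j])
    for j in range(len(positions))
]
```

In this toy example the last atom is spatially close to the source but three bonds away along the covalent chain, which is the kind of situation where the Euclidean and geodesic distances carry different information.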
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Protein-ligand binding affinity prediction | PDBbind Sequence Identity (60%) 2017 | RMSE | 1.473 | 10 |
| Protein-ligand binding affinity prediction | PDBbind Sequence Identity (30%) 2017 | RMSE | 1.554 | 10 |
| Enzyme-catalyzed reaction classification | Enzyme Commission (EC) numbers (test) | Reaction Class Accuracy | 87.2 | 9 |
| Protein-ligand binding affinity prediction | PDBbind 2017 (Scaffold) | RMSE | 1.592 | 8 |