Protein Sequence and Structure Co-Design with Equivariant Translation
About
Proteins are macromolecules that perform essential functions in all living organisms. Designing novel proteins with specific structures and desired functions has been a long-standing challenge in the field of bioengineering. Existing approaches generate both protein sequence and structure using either autoregressive models or diffusion models, both of which suffer from high inference costs. In this paper, we propose a new approach capable of protein sequence and structure co-design, which iteratively translates both protein sequence and structure into the desired state from random initialization, based on context features given a priori. Our model consists of a trigonometry-aware encoder that reasons about geometric constraints and interactions from context features, and a roto-translation equivariant decoder that translates protein sequence and structure interdependently. Notably, all protein amino acids are updated in one shot in each translation step, which significantly accelerates the inference process. Experimental results across multiple tasks show that our model outperforms previous state-of-the-art baselines by a large margin, and is able to design proteins of high fidelity in both sequence and structure, with running times orders of magnitude shorter than those of sampling-based methods.
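The iterative, one-shot translation idea described above can be illustrated with a toy sketch. This is not the paper's architecture: the update rule below is a simple distance-weighted message pass chosen only because it is roto-translation equivariant, the context encoder is omitted entirely, and the names `equivariant_step`, `co_design`, `w_msg`, and `w_seq` are hypothetical, with random matrices standing in for learned weights.

```python
import numpy as np

def equivariant_step(x, h, w_msg, w_seq):
    """Toy update of ALL residues at once (illustrative, not the paper's decoder).

    x: (n, 3) residue coordinates; h: (n, k) per-residue sequence logits.
    """
    n = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]       # (n, n, 3) relative vectors
    dist2 = np.sum(diff ** 2, axis=-1)         # (n, n) rotation/translation invariant
    # Pairwise weights depend only on invariant quantities.
    weight = np.tanh(h @ w_msg @ h.T) / (1.0 + dist2) / n
    np.fill_diagonal(weight, 0.0)
    # Coordinates move along relative vectors, so the update is equivariant.
    x_new = x + np.sum(weight[:, :, None] * diff, axis=1)
    # Sequence logits are updated from invariant features only.
    h_new = h + (weight @ h) @ w_seq
    return x_new, h_new

def co_design(n_residues, n_types=20, n_steps=10, seed=0):
    """Start from random sequence and structure, then refine both jointly."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n_residues, 3))        # random initial structure
    h = rng.normal(size=(n_residues, n_types))  # random initial sequence logits
    w_msg = rng.normal(scale=0.1, size=(n_types, n_types))  # stand-in weights
    w_seq = rng.normal(scale=0.1, size=(n_types, n_types))  # stand-in weights
    for _ in range(n_steps):                    # a few steps, each updating every residue
        x, h = equivariant_step(x, h, w_msg, w_seq)
    return x, h.argmax(axis=1)                  # coordinates and amino-acid type indices

coords, residue_types = co_design(n_residues=8)
```

Because every step touches all residues simultaneously, the cost scales with the number of translation steps rather than with sequence length, in contrast to residue-by-residue autoregressive decoding.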
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Generative Enzyme Design | Enzyme-substrate 323 fourth-level categories (test) | Average Score | 92 | 8 |
| Substrate binding affinity prediction | Enzyme Family | Binding Affinity (EC 1.1.1) (kcal/mol) | -5.99 | 8 |
| Enzyme-substrate interaction scoring | BRENDA (test) | EC 1.1.1 Score | 0.54 | 5 |
| Enzyme structural stability prediction | BRENDA 30 EC categories (test) | EC 1.1.1 Stability Score | 77.1 | 5 |
| Enzyme-substrate binding affinity | BRENDA 30 EC categories (test) | EC 1.1.1 Score | -6.61 | 5 |
| Enzyme-substrate binding prediction | EnzyGen Evaluation Set (top-5 candidate) | EC 1.1.1 Performance | 94 | 4 |
| Enzyme-substrate binding prediction | Enzyme Family candidates Top-10 | EC 1.1.1 Performance | 90 | 4 |
| Enzyme Design | Enzyme fourth-level categories top-1 candidate (test) | Score 1.1.1 | -8.37 | 4 |
| Enzyme Design Structural Stability Prediction | Enzyme Family (test) | EC 1.1.1 Score | 75.84 | 4 |
| Generative Enzyme Design | Enzyme Families | Category 1.1.1 Score | 81.07 | 4 |