Generative Enzyme Design Guided by Functionally Important Sites and Small-Molecule Substrates

About

Enzymes are genetically encoded biocatalysts capable of accelerating chemical reactions. How can we automatically design functional enzymes? In this paper, we propose EnzyGen, an approach to learn a unified model to design enzymes across all functional families. Our key idea is to generate an enzyme's amino acid sequence and its three-dimensional (3D) coordinates based on functionally important sites and substrates corresponding to a desired catalytic function. These sites are automatically mined from enzyme databases. EnzyGen consists of a novel interleaving network of attention and neighborhood equivariant layers, which captures both long-range correlation across an entire protein sequence and local influence from the nearest amino acids in 3D space. To learn the generative model, we devise a joint training objective that combines a sequence generation loss, a position prediction loss, and an enzyme-substrate interaction loss. We further construct EnzyBench, a dataset with 3157 enzyme families, covering all available enzymes within the Protein Data Bank (PDB). Experimental results show that EnzyGen consistently achieves the best performance across all 323 testing families, surpassing the best baseline by 10.79% in terms of substrate binding affinity. These findings demonstrate EnzyGen's superior capability in designing well-folded and effective enzymes that bind to specific substrates with high affinity.
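
The joint training objective named in the abstract can be illustrated with a minimal sketch. The abstract only lists the three loss terms. The concrete forms used here (cross-entropy for residues, mean squared distance for coordinates, binary cross-entropy for binding) and the weights `w_pos` and `w_int` are assumptions made for illustration, not the authors' stated formulation, and all function names are hypothetical.

```python
import math

def sequence_loss(pred_probs, target_residues):
    """Mean cross-entropy over positions (assumed form of the sequence generation loss)."""
    nll = [-math.log(probs[t]) for probs, t in zip(pred_probs, target_residues)]
    return sum(nll) / len(nll)

def position_loss(pred_coords, true_coords):
    """Mean squared Euclidean distance between predicted and true 3D coordinates
    (assumed form of the position prediction loss)."""
    sq = [sum((a - b) ** 2 for a, b in zip(p, q))
          for p, q in zip(pred_coords, true_coords)]
    return sum(sq) / len(sq)

def interaction_loss(pred_bind_prob, binds):
    """Binary cross-entropy on a predicted enzyme-substrate binding probability
    (assumed form of the interaction loss)."""
    p = min(max(pred_bind_prob, 1e-12), 1 - 1e-12)  # clamp to avoid log(0)
    return -math.log(p) if binds else -math.log(1 - p)

def joint_loss(seq, pos, inter, w_pos=1.0, w_int=1.0):
    """Weighted sum of the three terms; the weights are hypothetical hyperparameters."""
    return seq + w_pos * pos + w_int * inter

# Toy example: 2 residues over a 3-letter alphabet, 2 coordinates, one binding label.
seq = sequence_loss([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]], [0, 1])
pos = position_loss([(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
                    [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)])
inter = interaction_loss(0.9, True)
total = joint_loss(seq, pos, inter)
```

A plain weighted sum is the simplest way to combine the terms; how the three losses are actually balanced is not stated in the abstract.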

Zhenqiao Song, Yunlong Zhao, Wenxian Shi, Wengong Jin, Yang Yang, Lei Li • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Generative Enzyme Design | Enzyme-substrate 323 fourth-level categories (test) | Average Score | 97 | 8 |
| Substrate binding affinity prediction | Enzyme Family | Binding Affinity (EC 1.1.1) (kcal/mol) | -14.11 | 8 |
| Enzyme structural stability prediction | BRENDA 30 EC categories (test) | EC 1.1.1 Stability Score | 91.86 | 5 |
| Enzyme-substrate binding affinity | BRENDA 30 EC categories (test) | EC 1.1.1 Score | -8.44 | 5 |
| Enzyme-substrate interaction scoring | BRENDA (test) | EC 1.1.1 Score | 0.66 | 5 |
| Enzyme Design | Enzyme fourth-level categories top-1 candidate (test) | Score 1.1.1 | -16.63 | 4 |
| Enzyme-substrate binding prediction | EnzyGen Evaluation Set (top-5 candidate) | EC 1.1.1 Performance | 97 | 4 |
| Enzyme-substrate binding prediction | Enzyme Family candidates Top-10 | EC 1.1.1 Performance | 95 | 4 |
| Enzyme Design Structural Stability Prediction | Enzyme Family (test) | EC 1.1.1 Score | 90.05 | 4 |
| Generative Enzyme Design | Enzyme Families | Category 1.1.1 Score | 92.85 | 4 |