SMILES-Mamba: Chemical Mamba Foundation Models for Drug ADMET Prediction

About

In drug discovery, predicting the absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of small-molecule drugs is critical for ensuring safety and efficacy. However, predicting these properties accurately is often resource-intensive and requires extensive experimental data. To address this challenge, we propose SMILES-Mamba, a two-stage model that leverages both unlabeled and labeled data by combining self-supervised pretraining with fine-tuning. The model is first pretrained on a large corpus of unlabeled SMILES strings to capture the underlying chemical structure and relationships, and is then fine-tuned on smaller, labeled datasets specific to ADMET tasks. Our results demonstrate that SMILES-Mamba performs competitively across 22 ADMET datasets and achieves the highest score on 14 of them, highlighting the potential of self-supervised learning for improving molecular property prediction. This approach not only enhances prediction accuracy but also reduces the dependence on large labeled datasets, offering a promising direction for future research in drug discovery.
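As a rough illustration of what self-supervised pretraining on unlabeled SMILES strings can look like, the sketch below splits a SMILES string into tokens and builds input/target pairs for next-token prediction. This is a minimal sketch under stated assumptions: the abstract does not say which tokenizer or pretraining objective SMILES-Mamba uses, so the simplified regular expression, the next-token objective, and the names `tokenize`, `next_token_pairs`, `<bos>`, and `<eos>` are all hypothetical choices made for illustration.

```python
import re

# ASSUMPTION: a simplified SMILES tokenizer, not the one used by SMILES-Mamba.
# It matches bracket atoms ([NH3+]), two-letter halogens (Br, Cl),
# two-digit ring-bond labels (%12), and otherwise any single character.
TOKEN_RE = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return TOKEN_RE.findall(smiles)


def next_token_pairs(tokens, bos="<bos>", eos="<eos>"):
    """Build (inputs, targets) for next-token prediction.

    ASSUMPTION: next-token prediction is one common self-supervised
    objective for sequence models; the source does not confirm it.
    targets[i] is the token that follows inputs[i].
    """
    seq = [bos] + list(tokens) + [eos]
    return seq[:-1], seq[1:]


# No labels are needed: the string itself supplies the training signal.
aspirin = "CC(=O)Oc1ccccc1C(=O)O"
tokens = tokenize(aspirin)
inputs, targets = next_token_pairs(tokens)
```

In the second stage, the same tokenized input would be paired with a measured ADMET label (for example a toxicity flag or a solubility value) instead of the shifted sequence.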

Bohao Xu, Yingzhou Lu, Chenhao Li, Ling Yue, Xiao Wang, Tianfan Fu, Minjie Shen, Lulu Chen • 2024

Related benchmarks

Task                                    Dataset               Result                       Rank
Property Prediction                     CYP3A4                PR-AUC 89.3                  18
Property Prediction                     AMES                  ROC-AUC 0.801                18
ADMET Properties Prediction             TDC Caco2 Wang        MAE 0.438                    12
ADMET Properties Prediction             TDC PPBR AZ           MAE 9.371                    12
ADMET Properties Prediction             TDC Half Life Obach   Spearman Correlation 0.247   12
ADMET Properties Prediction             TDC DILI              AUROC 0.928                  12
drug absorption property prediction     PGP                   ROC-AUC 93                   5
drug absorption property prediction     Bioav                 ROC-AUC 0.673                5
drug absorption property prediction     Aqsol                 MAE 0.819                    5
Drug distribution property prediction   BBB                   ROC-AUC 0.852                5
Showing 10 of 22 rows
