InstructMol: Multi-Modal Integration for Building a Versatile and Reliable Molecular Assistant in Drug Discovery

About

The rapid evolution of artificial intelligence in drug discovery encounters challenges with generalization and extensive training, yet Large Language Models (LLMs) offer promise in reshaping interactions with complex molecular data. Our novel contribution, InstructMol, a multi-modal LLM, effectively aligns molecular structures with natural language via an instruction-tuning approach, utilizing a two-stage training strategy that adeptly combines limited domain-specific data with molecular and textual information. InstructMol showcases substantial performance improvements in drug discovery-related molecular tasks, surpassing leading LLMs and significantly reducing the gap with specialized models, thereby establishing a robust foundation for a versatile and dependable drug discovery assistant.

He Cao, Zijing Liu, Xingyu Lu, Yuan Yao, Yu Li • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Molecular property prediction | QM9 (test) | -- | 229 |
| Molecular property prediction | MoleculeNet BBBP (scaffold) | ROC AUC 72.4 | 140 |
| Molecule Captioning | ChEBI-20 (test) | METEOR 0.509 | 114 |
| Molecular property prediction | MoleculeNet HIV (scaffold) | ROC AUC 74 | 66 |
| Molecular property prediction | BACE (test) | ROC-AUC 85.9 | 65 |
| Molecular Property Classification | MoleculeNet BBBP | ROC AUC 72.4 | 56 |
| Molecular property prediction | BACE | ROC-AUC 82.3 | 55 |
| Molecular property prediction | BBBP | ROC AUC 0.7 | 48 |
| Molecular property prediction | ClinTox | ROC AUC 91.5 | 47 |
| Molecular Property Classification | MoleculeNet BACE | ROC AUC 82.1 | 36 |
Showing 10 of 34 rows
