Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model

About

Understanding molecules is key to understanding organisms and driving advances in drug discovery, requiring interdisciplinary knowledge across chemistry and biology. Although large molecular language models have achieved notable success in task transfer, they often struggle to accurately analyze molecular features due to limited knowledge and reasoning capabilities. To address this issue, we present Mol-LLaMA, a large molecular language model that grasps the general knowledge centered on molecules and exhibits explainability and reasoning ability. To this end, we design key data types that encompass the fundamental molecular features, taking into account the essential abilities for molecular reasoning. Further, to improve molecular understanding, we propose a module that integrates complementary information from different molecular encoders, leveraging the distinct advantages of molecular representations. Our experimental results demonstrate that Mol-LLaMA is capable of comprehending the general features of molecules and providing informative responses, implying its potential as a general-purpose assistant for molecular analysis. Our project page is at https://mol-llama.github.io/.

Dongki Kim, Wonbin Lee, Sung Ju Hwang • 2025

Related benchmarks

Task | Dataset | Result | Rank
Molecule Captioning | ChEBI-20 (test) | BLEU-4: 0.0206 | 107
Forward Reaction Prediction | Mol-Instructions | -- | 24
Reagent Prediction | Mol-Instructions | -- | 24
Retrosynthesis | Mol-Instructions | -- | 24
Molecule Captioning | Mol-Instructions | ROUGE-L: 0.759 | 17
Entity Recognition | Mol-Instructions | F1 Score: 74 | 13
Interaction Extraction | Mol-Instructions | F1 Score: 21 | 13
Multi-Choice | Mol-Instructions | Accuracy: 88 | 13
True-or-False | Mol-Instructions | Accuracy: 60 | 13
Molecular Design | Mol-Instructions | Validity: 95.3 | 13
Showing 10 of 13 rows
