ChemATP: A Training-Free Chemical Reasoning Framework for Large Language Models
About
Large Language Models (LLMs) exhibit strong general reasoning but struggle in molecular science due to the lack of explicit chemical priors in standard string representations. Current solutions face a fundamental dilemma. Training-based methods inject priors into parameters, but this static coupling hinders rapid knowledge updates and often compromises the model's general reasoning capabilities. Conversely, existing training-free methods avoid these issues but rely on surface-level prompting, failing to provide the fine-grained atom-level priors essential for precise chemical reasoning. To address this issue, we introduce ChemATP, a framework that decouples chemical knowledge from the reasoning engine. By constructing the first atom-level textual knowledge base, ChemATP enables frozen LLMs to explicitly retrieve and reason over this information dynamically. This architecture ensures interpretability and adaptability while preserving the LLM's intrinsic general intelligence. Experiments show that ChemATP significantly outperforms training-free baselines and rivals state-of-the-art training-based models, demonstrating that explicit prior injection is a competitive alternative to implicit parameter updates.
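The retrieve-then-reason idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from ChemATP itself: the `AtomPrior` record, the toy `KNOWLEDGE_BASE` keyed by SMILES string, and the helper names `retrieve_atom_priors` and `build_prompt` are all hypothetical. The sketch only shows how atom-level text might be looked up and placed in front of a frozen LLM as explicit context; the model call itself is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AtomPrior:
    """Hypothetical record holding one atom's textual prior."""
    index: int
    element: str
    hybridization: str
    num_hydrogens: int
    note: str


# Toy, hand-written knowledge base (an assumption, not the ChemATP data).
# The single entry describes ethanol, SMILES "CCO".
KNOWLEDGE_BASE = {
    "CCO": [
        AtomPrior(0, "C", "sp3", 3, "methyl carbon bonded to atom 1"),
        AtomPrior(1, "C", "sp3", 2, "methylene carbon bonded to atoms 0 and 2"),
        AtomPrior(2, "O", "sp3", 1,
                  "hydroxyl oxygen; hydrogen-bond donor and acceptor"),
    ],
}


def retrieve_atom_priors(smiles, kb=KNOWLEDGE_BASE):
    """Return the stored atom-level priors for a molecule, or an empty list."""
    return list(kb.get(smiles, []))


def format_prior(prior):
    """Render one atom's prior as a single line of plain text."""
    return (f"Atom {prior.index} ({prior.element}): {prior.hybridization}, "
            f"{prior.num_hydrogens} attached H; {prior.note}.")


def build_prompt(smiles, question, kb=KNOWLEDGE_BASE):
    """Assemble a prompt that injects retrieved priors as explicit text.

    No model weights are touched: the knowledge lives only in the prompt.
    """
    priors = retrieve_atom_priors(smiles, kb)
    lines = [f"Molecule (SMILES): {smiles}"]
    if priors:
        lines.append("Atom-level priors:")
        lines.extend(format_prior(p) for p in priors)
    else:
        lines.append("Atom-level priors: none retrieved.")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


prompt = build_prompt(
    "CCO", "Is this molecule likely to cross the blood-brain barrier?"
)
print(prompt)
```

Because the knowledge sits in a lookup table rather than in model parameters, updating a prior in this sketch means editing one entry, with no retraining step.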
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Molecular property prediction | BBBP | ROC AUC | 0.8731 | 35 |
| Molecular property prediction | BACE | ROC AUC | 89.7 | 35 |
| Molecular Classification | HIV | ROC AUC | 72.7 | 35 |
| Molecular property prediction | ClinTox | ROC AUC | 88.56 | 34 |
| Molecular property prediction | SIDER | ROC AUC | 0.7179 | 21 |
| Regression | FreeSolv | RMSE | 1.0177 | 20 |
| Regression | MoleculeNet LIPO | RMSE | 0.7143 | 19 |
| Molecular property prediction | Tox21 | ROC AUC | 80.23 | 17 |
| Regression | ESOL | RMSE | 0.6504 | 9 |
| Regression | Caco2 | RMSE | 0.5041 | 7 |