Text-Augmented Multimodal LLMs for Chemical Reaction Condition Recommendation
About
Identifying reaction conditions that are broadly applicable across diverse substrates is a longstanding challenge in chemical and pharmaceutical research. While many methods can generate conditions with acceptable performance, a universal approach for reliably discovering effective conditions during reaction exploration remains rare. Consequently, current reaction optimization is often labor-intensive, time-consuming, and costly, relying heavily on trial-and-error experimentation. Large language models (LLMs) can now tackle chemistry-related problems such as molecule design and chemical reasoning tasks. Here, we report the design, implementation, and application of Chemma-RC, a text-augmented multimodal LLM that identifies effective conditions through task-specific dialogue and condition generation. Chemma-RC learns a unified representation of chemical reactions by aligning multiple modalities (text corpus, reaction SMILES, and reaction graphs) within a shared embedding module. Performance benchmarking on datasets showed high precision in identifying optimal conditions, with up to 17% improvement over current state-of-the-art methods. A palladium-catalysed imidazole C-H arylation reaction was investigated experimentally to evaluate the functionality of Chemma-RC in practice. Our findings suggest that Chemma-RC holds significant potential to accelerate high-throughput condition screening in chemical synthesis.
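For readers unfamiliar with the reaction SMILES modality mentioned above, the following is a minimal sketch of how such a string separates into reactants, agents, and products. It relies only on the standard `reactants>agents>products` layout of reaction SMILES; the `Reaction` class and `parse_reaction_smiles` function are hypothetical names introduced for illustration and are not taken from Chemma-RC. The agents field is where condition components such as catalysts, solvents, and reagents typically appear.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Reaction:
    """Hypothetical container for the three fields of a reaction SMILES."""
    reactants: list[str]
    agents: list[str]      # catalysts, solvents, reagents (the "conditions")
    products: list[str]


def parse_reaction_smiles(rxn: str) -> Reaction:
    """Split 'reactants>agents>products' into lists of molecule SMILES.

    Assumption: plain reaction SMILES with no trailing extension block.
    """
    fields = rxn.strip().split(">")
    if len(fields) != 3:
        raise ValueError(f"expected exactly two '>' separators, got: {rxn!r}")

    def molecules(field: str) -> list[str]:
        # '.' separates disconnected species; an empty field yields an empty list.
        return [smi for smi in field.split(".") if smi]

    reactants, agents, products = (molecules(f) for f in fields)
    return Reaction(reactants, agents, products)


# Acid-catalysed esterification: acetic acid + ethanol, with sulfuric acid as the agent.
example = parse_reaction_smiles("CC(=O)O.OCC>OS(=O)(=O)O>CC(=O)OCC.O")
print(example.agents)
```

Note that splitting on `.` also separates the ions of a salt (for example `[Na+].[Cl-]`), so a real pipeline would need a cheminformatics toolkit to regroup such species; this sketch deliberately leaves that out.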
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Solvent 2 prediction | Private Dataset | Top-1 Similarity | 49.3 | 9 |
| Reagent 1 prediction | Private Dataset | Top-1 Similarity | 55.7 | 9 |
| Solvent 1 prediction | Private Dataset | Top-1 Similarity | 53.7 | 9 |
| Catalyst prediction | Private Dataset | Top-1 Similarity | 43.4 | 9 |
| Reagent 2 prediction | Private Dataset | Top-1 Similarity | 40.2 | 9 |