MolLingo: Molecule-Native Representations for LLM-Powered Scientific Agents

About

We present MolLingo, a multi-agent system that emulates the reasoning process of a chemist to automate molecular design. Existing LLM-based approaches either operate as standalone generative models without access to external tools or lack the multi-agent coordination and shared memory needed for iterative, evidence-driven reasoning across the molecular design pipeline. MolLingo addresses this by coordinating a Literature Agent, a Chemist Agent, and an Orchestrator through a shared memory module, with each agent equipped with domain-specific tools. To enable effective molecular reasoning, we introduce BRICS-based Fragment Enumeration (BFE), a synthesis-aware molecular fragmentation method that decomposes molecules into chemically meaningful building blocks represented as block-based SMILES paired with common chemical names. This representation bridges molecular structure and LLM semantic space, enabling block-level reasoning and editing that is difficult with raw SMILES alone. As a case study in early-stage therapeutic design, MolLingo further grounds the Chemist Agent's reasoning in binding site geometry and residue-level protein context derived from molecular docking to optimize molecules for stronger target binding. Across four benchmarks, MolLingo consistently outperforms frontier LLMs and specialized baselines, including a fourfold docking score improvement over GPT-5.4 despite using the same underlying model, consistent drug property optimization gains across multiple LLM backbones, and state-of-the-art results on TOMG-Bench, surpassing both frontier LLMs and the RL-based optimization method RePO. Our results suggest that LLMs are already capable molecular design assistants when guided through chemically meaningful representations and biologically grounded structural context. Code is available at: https://anonymous.4open.science/status/MolLingo-7450.
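The block-based representation described above, fragment SMILES paired with common chemical names, can be illustrated with a minimal sketch. This is not the authors' BFE implementation: the `Block` class, the `render_blocks` and `swap_block` helpers, the use of a plain `*` as an attachment-point marker, and the hand-written acetaminophen fragments are all assumptions made for illustration, and no BRICS fragmentation is performed here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    """One building block (hypothetical structure, not from the paper)."""
    smiles: str  # fragment SMILES; "*" marks an attachment point (assumed convention)
    name: str    # common chemical name shown alongside the SMILES


def render_blocks(blocks: List[Block]) -> str:
    """Format blocks as numbered lines that a prompt can refer to by index."""
    return "\n".join(
        f"[{i}] {b.smiles} ({b.name})" for i, b in enumerate(blocks, start=1)
    )


def swap_block(blocks: List[Block], index: int, new_block: Block) -> List[Block]:
    """Return a new list with the block at the 1-based `index` replaced."""
    if not 1 <= index <= len(blocks):
        raise IndexError(f"block index {index} out of range 1..{len(blocks)}")
    return blocks[: index - 1] + [new_block] + blocks[index:]


# Hand-written fragments of acetaminophen, CC(=O)Nc1ccc(O)cc1 (illustrative only).
blocks = [
    Block("*C(C)=O", "acetyl"),
    Block("*Nc1ccc(O)cc1", "4-hydroxyanilino"),
]

# A block-level edit: replace the acetyl group with a propanoyl group.
edited = swap_block(blocks, 1, Block("*C(CC)=O", "propanoyl"))

print(render_blocks(blocks))
print(render_blocks(edited))
```

Addressing blocks by index and name, rather than by character position in a raw SMILES string, is what makes an edit such as "replace block 1" easy to state and check. Reassembling the edited blocks into a full molecule would need a cheminformatics toolkit and is left out of this sketch.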

Thao Nguyen, Heng Ji • 2026

Related benchmarks

Task	Dataset	Metric	Value	
Molecular Optimization (LogP)	TOMG-Bench	Success Rate (SR)	96.8	39
Molecular Optimization (MR)	TOMG-Bench	Success Rate (SR)	99.8	39
Molecular Optimization (QED)	TOMG-Bench	Success Rate (SR)	76.5	39
Molecular Optimization	DILI TDC (test)	Improvement (%)	18.451	8
Molecular Optimization	hERG withdrawn drugs	Improvement (%)	5.37	8
Hit-to-lead progression	Hit-to-lead 30 protein targets (test)	Validity	100	7
Hit Identification	AKT1 (top 30 hits)	Docking Score	-6.368	2
Hit Identification	CDK2 (top 30 hits)	Docking Score	-9.308	2
