SMolLM: Small Language Models Learn Small Molecular Grammar

About

Language models for molecular design have scaled to hundreds of millions of parameters, yet how they learn chemical grammar is poorly understood. We train SMolLM, a 53K-parameter weight-shared transformer, to generate novel SMILES with 95% validity on the ZINC-250K drug-like-molecule benchmark, outperforming a standard GPT with 10 times more parameters. Mechanistically, the same block resolves SMILES constraints across passes in a fixed hierarchy: brackets first, rings second, and valence last, as shown by error classification and linear probing, with ablation isolating the bracket-matching head. Together, these results yield a compact, mechanistically interpretable molecular generator and a testbed for studying iterative computation in formal-language domains.
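To make the first two levels of that hierarchy concrete, here is a minimal sketch of a syntax-only error classifier for SMILES strings. It is an illustration under stated assumptions, not the authors' classifier: the function name `classify_smiles_error` is hypothetical, "bracket" errors are assumed to cover both branch parentheses and square-bracket atoms, and valence errors are out of scope because checking them requires a chemistry toolkit.

```python
def classify_smiles_error(smiles: str) -> str:
    """Return 'bracket', 'ring', or 'ok' using syntax checks only.

    Hypothetical helper for illustration; valence is not checked here.
    """
    depth = 0            # nesting depth of branch parentheses
    in_atom = False      # True while inside a [...] bracket atom
    open_rings = set()   # ring-closure labels that are opened but not yet closed
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if in_atom:
            # Digits inside a bracket atom are isotopes, H counts, or charges.
            if ch == "]":
                in_atom = False
            elif ch == "[":
                return "bracket"
        elif ch == "[":
            in_atom = True
        elif ch == "]":
            return "bracket"
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return "bracket"
        elif ch == "%":
            # Two-digit ring-closure label, e.g. %12
            label = smiles[i + 1:i + 3]
            if len(label) != 2 or not all(c in "0123456789" for c in label):
                return "ring"
            open_rings ^= {int(label)}  # toggle: open on first sight, close on second
            i += 2
        elif ch in "0123456789":
            open_rings ^= {int(ch)}
        i += 1
    if depth != 0 or in_atom:
        return "bracket"
    if open_rings:
        return "ring"
    return "ok"
```

A string that passes both checks can still be chemically invalid, so in practice the remaining "valence" category would be assigned by parsing the string with a cheminformatics library.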

Akhil Jindal, Harang Ju • 2026

Related benchmarks

Task                  Dataset    Result    Rank
Molecular Generation  ZINC 250K  FCD 2.45  45
