
MMAI Gym for Science: Training Liquid Foundation Models for Drug Discovery

About

General-purpose large language models (LLMs) that rely on in-context learning do not reliably deliver the scientific understanding and performance required for drug discovery tasks. Simply increasing model size or introducing reasoning tokens does not yield significant performance gains. To address this gap, we introduce the MMAI Gym for Science, a one-stop shop for molecular data formats and modalities, as well as task-specific reasoning, training, and benchmarking recipes designed to teach foundation models the 'language of molecules' in order to solve practical drug discovery problems. We use MMAI Gym to train an efficient Liquid Foundation Model (LFM) for these applications, demonstrating that smaller, purpose-trained foundation models can outperform substantially larger general-purpose or specialist models on molecular benchmarks. Across essential drug discovery tasks (molecular optimization, ADMET property prediction, retrosynthesis, drug-target activity prediction, and functional group reasoning), the resulting model achieves near specialist-level performance and, in the majority of settings, surpasses larger models, while remaining more efficient and broadly applicable in the domain.

Maksim Kuznetsov, Zulfat Miftahutdinov, Rim Shayakhmetov, Mikolaj Mizera, Roman Schutski, Bogdan Zagribelnyy, Ivan Ilin, Nikita Bondarev, Thomas MacDougall, Mathieu Reymond, Mihir Bafna, Kaeli Kaymak-Loveless, Eugene Babin, Maxim Malkov, Mathias Lechner, Ramin Hasani, Alexander Amini, Vladimir Aladinskiy, Alex Aliper, Alex Zhavoronkov • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Single-step retrosynthesis | URSA expert 2026 | Unique Rate | 94 | 21 |
| Single-step retrosynthesis | USPTO-50k (test) | -- | -- | 18 |
| Functional Group Reasoning (Binary Classification) | FGBench Single | Accuracy | 84.1 | 16 |
| Functional Group Reasoning (Binary Classification) | FGBench Interaction | Accuracy | 81.9 | 16 |
| Functional Group Reasoning (Binary Classification) | FGBench Comparison | Accuracy | 81 | 16 |
| Functional Group Reasoning (Numeric Regression) | FGBench Single | RMSE | 55.954 | 16 |
| Functional Group Reasoning (Numeric Regression) | FGBench Interaction | RMSE | 25.046 | 16 |
| Functional Group Reasoning (Numeric Regression) | FGBench Comparison | RMSE | 48.344 | 16 |
| ADMET Properties Prediction | TDC Caco2 Wang | MAE | 0.347 | 12 |
| ADMET Properties Prediction | TDC PPBR AZ | MAE | 7.722 | 12 |
Showing 10 of 30 rows
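
For readers less familiar with the metrics listed above, the following is a minimal sketch of the standard textbook definitions of accuracy, MAE, and RMSE. It is not the authors' evaluation code, which is not shown on this page. The function names and toy inputs are hypothetical, and reporting accuracy as a percentage is an assumption based on the magnitudes in the table.

```python
import math


def accuracy_percent(y_true, y_pred):
    """Share of predictions equal to the label, in percent (assumed scale)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100.0 * hits / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: average of |prediction - target|."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt of the average squared difference."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_sq = sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mean_sq)


# Toy values for illustration only; these are not benchmark data.
toy_acc = accuracy_percent([1, 0, 1, 1], [1, 0, 0, 1])
toy_mae = mae([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
toy_rmse = rmse([0.0, 0.0], [3.0, 4.0])
```

Higher is better for accuracy, while lower is better for MAE and RMSE. Both error metrics are in the units of the predicted property, so values are not comparable across datasets.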
