From Molecules to Materials: Pre-training Large Generalizable Models for Atomic Property Prediction

About

Foundation models have been transformational in machine learning fields such as natural language processing and computer vision. Similar success in atomic property prediction has been limited due to the challenges of training effective models across multiple chemical domains. To address this, we introduce Joint Multi-domain Pre-training (JMP), a supervised pre-training strategy that simultaneously trains on multiple datasets from different chemical domains, treating each dataset as a unique pre-training task within a multi-task framework. Our combined training dataset consists of $\sim$120M systems from OC20, OC22, ANI-1x, and Transition-1x. We evaluate performance and generalization by fine-tuning over a diverse set of downstream tasks and datasets including: QM9, rMD17, MatBench, QMOF, SPICE, and MD22. JMP demonstrates an average improvement of 59% over training from scratch, and matches or sets state-of-the-art on 34 out of 40 tasks. Our work highlights the potential of pre-training strategies that utilize diverse data to advance property prediction across chemical domains, especially for low-data tasks. Please visit https://nima.sh/jmp for further information.
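The multi-task framing above, where each dataset is its own pre-training task, can be illustrated with a toy sketch. Everything below is an assumption made for illustration only: the scalar "backbone", the per-dataset heads, the made-up input/target pairs, and the equal averaging of per-task losses. The abstract does not specify the model architecture or how the per-dataset losses are weighted, and this is not the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class MultiTaskModel:
    """Toy model: one shared weight (the 'backbone') plus one scalar head per dataset."""
    shared_weight: float = 1.0
    heads: dict = field(default_factory=dict)

    def predict(self, task: str, x: float) -> float:
        features = self.shared_weight * x       # representation shared by all tasks
        return self.heads[task] * features      # task-specific output head


def multitask_loss(model, batches):
    """Average per-task mean squared errors so each dataset counts as one task.

    Equal weighting across tasks is an assumption, not taken from the source.
    """
    per_task = {}
    for task, samples in batches.items():
        errors = [(model.predict(task, x) - y) ** 2 for x, y in samples]
        per_task[task] = sum(errors) / len(errors)
    total = sum(per_task.values()) / len(per_task)
    return total, per_task


# Dataset names come from the abstract; the (input, target) pairs are made up.
model = MultiTaskModel(
    heads={"OC20": 1.0, "OC22": 1.0, "ANI-1x": 1.0, "Transition-1x": 1.0}
)
batches = {
    "OC20": [(1.0, 1.0), (2.0, 2.0)],
    "OC22": [(1.0, 2.0)],
    "ANI-1x": [(3.0, 3.0)],
    "Transition-1x": [(2.0, 0.0)],
}
total, per_task = multitask_loss(model, batches)
print(per_task, total)
```

In a sketch like this, the shared weight receives gradient signal from every dataset while each head only sees its own task, which is the general idea behind training one backbone across several chemical domains.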

Nima Shoghi, Adeesh Kolluru, John R. Kitchin, Zachary W. Ulissi, C. Lawrence Zitnick, Brandon M. Wood • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Molecular property prediction | QM9 | Cv: 0.017 | 80 |
| Band gap prediction | Matbench MP Gap (Fold 0) | MAE (eV): 0.089 | 14 |
| Bandgap Prediction | Matbench Bandgap | MAE (eV): 0.091 | 12 |
| Band gap prediction | Matbench MP Gap Mean 0-4 | MAE (eV): 0.091 | 7 |
| Material Property Prediction | Matminer Exfoliation Energy 5-split average | MAE: 35.4 | 7 |
| Material Property Prediction | Matminer Elastic Anisotropy 5-split average | MAE: 2.42 | 7 |
| Material Property Prediction | Matminer 2D Dielectric Constant 5-split average | MAE: 2.25 | 7 |
| Material Property Prediction | Matminer 3D Poly Electronic 5-split average | MAE: 23.3 | 7 |
| Material Property Prediction | Matminer Poly Electronic 5-split average | MAE: 2.11 | 7 |
| Material Property Prediction | Matminer Poly Total 5-split average | MAE: 4.89 | 7 |
Showing 10 of 20 rows
