MolSight: Molecular Property Prediction with Images

About

Every molecule ever synthesised can be drawn as a 2D skeletal diagram, yet in modern property prediction this universally available representation has received less attention than molecular graphs, 3D conformers, or billion-parameter language models, each of which imposes its own computational and data-engineering overhead. We present $\textbf{MolSight}$, the first systematic large-scale study of vision-based Molecular Property Prediction (MPP). Using 10 vision architectures, 7 pre-training strategies, and $2\,M$ molecule images, we evaluate performance across 10 downstream tasks spanning physical-property regression, drug-discovery classification, and quantum-chemistry prediction. To account for the wide variation in structural complexity across pre-training molecules, we further propose a $\textbf{chemistry-informed curriculum}$: five structural complexity descriptors partition the corpus into five tiers of increasing chemical difficulty, and the resulting curriculum consistently outperforms non-curriculum baselines. We show that a single rendered bond-line image, processed by a vision encoder, is sufficient for competitive molecular property prediction, i.e. $\textit{chemical insight from sight alone}$. The best curriculum-trained configuration achieves the top result on $\textbf{5 of 10}$ benchmarks and a top-two result on $\textbf{all 10}$, at $\textbf{\textit{80$\times$ lower}}$ FLOPs than the nearest multi-modal competitor.
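The tiering step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the five descriptors or say how they are combined, so everything here is an assumption for illustration: each molecule is given as a row of five numeric descriptor values, the columns are min-max normalised and averaged into a composite score, and molecules are split into equal-size tiers by rank. The function names `composite_scores` and `assign_tiers` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def composite_scores(descriptors: Sequence[Sequence[float]]) -> List[float]:
    """Min-max normalise each descriptor column, then average per molecule.

    Assumption: an unweighted mean of normalised descriptors; the paper may
    combine its descriptors differently.
    """
    n_cols = len(descriptors[0])
    columns = [[row[j] for row in descriptors] for j in range(n_cols)]
    lows = [min(col) for col in columns]
    spans = [max(col) - min(col) for col in columns]
    scores = []
    for row in descriptors:
        # A constant column carries no ranking information, so it maps to 0.
        normed = [
            (row[j] - lows[j]) / spans[j] if spans[j] > 0 else 0.0
            for j in range(n_cols)
        ]
        scores.append(sum(normed) / n_cols)
    return scores


def assign_tiers(scores: Sequence[float], n_tiers: int = 5) -> List[int]:
    """Assign equal-frequency tiers by rank; tier 0 holds the simplest molecules."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    tiers = [0] * len(scores)
    for rank, idx in enumerate(order):
        tiers[idx] = rank * n_tiers // len(scores)
    return tiers


if __name__ == "__main__":
    # Ten toy molecules whose five (hypothetical) descriptors all equal i.
    toy = [[float(i)] * 5 for i in range(10)]
    print(assign_tiers(composite_scores(toy)))  # [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
```

A curriculum would then present tier 0 first and progress towards tier 4; the schedule for doing so is not specified in the abstract and is left out of this sketch.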

Aaditya Baranwal, Akshaj Gupta, Shruti Vyas, Yogesh S Rawat • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Molecular property prediction | QM9 (test) | -- | -- | 245 |
| Molecular property prediction | BBBP (test) | ROC-AUC | 0.893 | 94 |
| Molecular property prediction | BACE | ROC-AUC | 89 | 73 |
| Molecular property prediction | Tox21 | ROC-AUC | 90.7 | 47 |
| Molecular Property Prediction (Regression) | ESOL | RMSE | 0.515 | 44 |
| Quantum Property Prediction | QM9 | HOMO-LUMO Gap (Delta_epsilon) | 0.007 | 42 |
| Molecular Classification | HIV | ROC-AUC | 83.2 | 42 |
| Classification | BBBP | ROC-AUC | 0.937 | 39 |
| Regression | QM9 | Gap Energy Error (delta_e) | 0.007 | 18 |
| Toxicity Property Prediction | LD50 | MAE | 0.369 | 11 |
Showing 10 of 16 rows
