Structure-guided molecular design with contrastive 3D protein-ligand learning

About

Structure-based drug discovery faces the dual challenge of accurately capturing 3D protein-ligand interactions while navigating ultra-large chemical spaces to identify synthetically accessible candidates. In this work, we present a unified framework that addresses these challenges by combining contrastive 3D structure encoding with autoregressive molecular generation conditioned on commercial compound spaces. First, we introduce an SE(3)-equivariant transformer that encodes ligand and pocket structures into a shared embedding space via contrastive learning, achieving competitive results in zero-shot virtual screening. Second, we integrate these embeddings into a multimodal Chemical Language Model (MCLM). The model generates target-specific molecules conditioned on either pocket or ligand structures, with a learned dataset token that steers the output toward targeted chemical spaces, yielding candidates with favorable predicted binding properties across diverse targets.

Carles Navarro, Philipp Tholke, Gianni de Fabritiis • 2026
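
The abstract describes encoding pockets and ligands into a shared embedding space via contrastive learning and then using that space for zero-shot virtual screening. The sketch below illustrates one common way such a setup can work. It is not the authors' implementation: the symmetric InfoNCE loss, the cosine-similarity scoring, the temperature of 0.07, and every function name are assumptions made for illustration, and the toy 2-D vectors stand in for the output of the SE(3)-equivariant encoder, which is not modeled here.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_similarity_matrix(pockets, ligands):
    """Return sim[i][j] = cosine similarity between pocket i and ligand j."""
    p = [l2_normalize(v) for v in pockets]
    q = [l2_normalize(v) for v in ligands]
    return [[sum(a * b for a, b in zip(pi, qj)) for qj in q] for pi in p]


def _diagonal_cross_entropy(logits):
    """Mean of -log softmax(row_i)[i]: the matched pair is the 'correct class'."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def symmetric_info_nce(pockets, ligands, temperature=0.07):
    """Assumed loss: InfoNCE averaged over pocket->ligand and ligand->pocket.

    pockets[i] and ligands[i] are treated as a positive pair; all other
    combinations in the batch act as negatives.
    """
    sim = cosine_similarity_matrix(pockets, ligands)
    logits = [[s / temperature for s in row] for row in sim]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (_diagonal_cross_entropy(logits) + _diagonal_cross_entropy(transposed))


def rank_ligands(pocket, ligands):
    """Zero-shot screening sketch: sort ligand indices by similarity to one pocket."""
    scores = cosine_similarity_matrix([pocket], ligands)[0]
    return sorted(range(len(ligands)), key=lambda j: scores[j], reverse=True)


# Toy embeddings (hypothetical): ligand i is close to pocket i.
pockets = [[1.0, 0.0], [0.0, 1.0]]
ligands = [[0.9, 0.1], [0.2, 0.8]]

aligned_loss = symmetric_info_nce(pockets, ligands)
shuffled_loss = symmetric_info_nce(pockets, ligands[::-1])
top_for_pocket_0 = rank_ligands(pockets[0], ligands)
```

With these toy vectors the correctly paired batch gives a lower loss than the batch with the ligands reversed, which is the property contrastive training exploits. At screening time no further training is needed under this scheme: a library is ranked per pocket by embedding similarity alone.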

Related benchmarks

Task	Dataset	Metric	Result	Rank
Virtual Screening	LIT-PCBA (test)	AUROC	53.76	17
Structure-Conditioned Molecular Design	LIT-PCBA 15 targets	Affinity Probability	68	5

