AgentPLM: Agentic Protein Language Models with Reasoning-Augmented Decoding for Protein Sequence Design

About

Protein language models (PLMs) are passive oracles: they generate sequences in a single forward pass, with no mechanism to consult external biophysical feedback or to redirect generation when a candidate violates thermodynamic or structural constraints. We introduce AgentPLM, which addresses this by equipping a pre-trained PLM with (i) Reasoning-Augmented Decoding (RAD), which interleaves autoregressive generation with tool calls (ESMFold, FoldX, AutoDock Vina), and (ii) Contrastive Agent Policy Optimisation (CAPO), a trajectory-level extension of direct preference optimisation that trains the policy end-to-end to learn when oracle feedback is informative rather than merely imitating high-fitness sequences. We evaluate AgentPLM on benchmark tasks spanning de novo enzyme design, antibody optimisation, thermostability, PPI interface design, and zero-shot fitness prediction, using standardised oracle APIs and controlled sequence-identity splits. AgentPLM achieves state-of-the-art results, including a gain in antibody top-10% hit rate over the strongest passive baseline, and provides mechanistic evidence of online error correction without explicit backtracking.
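The abstract describes CAPO only as a trajectory-level extension of direct preference optimisation and does not state its objective. As a rough illustration of the starting point, the sketch below computes the standard DPO loss for one preference pair, where each "response" is a whole trajectory scored by the sum of its per-step log-probabilities. The function names, the `beta` value, and the choice to sum step log-probabilities are assumptions made for illustration, not the authors' formulation.

```python
import math


def sequence_logprob(step_logprobs):
    """Sum per-step log-probabilities into one trajectory log-probability.

    Assumption: a trajectory is scored by the plain sum over its steps.
    """
    return sum(step_logprobs)


def dpo_loss(policy_win, policy_lose, ref_win, ref_lose, beta=0.1):
    """Standard DPO loss for a single (preferred, dispreferred) pair.

    Inputs are trajectory log-probabilities under the trained policy and
    under a frozen reference model. This is plain DPO, not CAPO itself.
    """
    # Implicit reward margin: how much more the policy favours the preferred
    # trajectory than the reference model does.
    margin = beta * ((policy_win - ref_win) - (policy_lose - ref_lose))
    # -log(sigmoid(margin)) == log(1 + exp(-margin)), evaluated stably.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical log-probabilities for two trajectories.
preferred = sequence_logprob([-0.2, -0.1, -0.3])
dispreferred = sequence_logprob([-0.9, -1.1, -0.8])
loss = dpo_loss(preferred, dispreferred, ref_win=-1.0, ref_lose=-2.0)
```

When the policy and the reference model assign identical log-probabilities, the margin is zero and the loss equals log 2; the loss drops below that value as the policy shifts probability towards the preferred trajectory relative to the reference.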

Sahil Rahman, Maxx Richard Rahman • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Antibody optimization | AntibodyOpt | Top-10% Hit Rate | 52.41 | 6 |
| Enzyme Design | EnzyDes | Normalised kcat/Km Ratio | 1.89 | 6 |
| Thermostability improvement | ThermoStab | Mean Tm Improvement (°C) | 7.64 | 6 |
| Zero-shot fitness prediction | DMS | Spearman Correlation | 0.61 | 6 |