
A Study of Adaptive Modeling Towards Robust Generalization

About

Large language models (LLMs) increasingly support reasoning over biomolecular structures, but most existing approaches remain modality-specific and rely on either sequence-style encodings or fixed-length connector tokens for structural inputs. These designs can under-expose explicit geometric cues and impose rigid fusion bottlenecks, leading to over-compression and poor token allocation as structural complexity grows. We present a unified all-atom framework that grounds language reasoning in geometric information while adaptively scaling structural tokens. The method first constructs variable-size structural patches on molecular graphs using an instruction-conditioned gating policy, enabling complexity-aware allocation of query tokens. It then refines the resulting patch tokens via cross-attention with modality embeddings and injects geometry-informed tokens into the language model to improve structure grounding and reduce structural hallucinations. Across diverse all-atom benchmarks, the proposed approach yields consistent gains in heterogeneous structure-grounded reasoning. An anonymized implementation is provided in the supplementary material.
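The abstract's two-stage idea (complexity-aware allocation of query tokens, then cross-attention refinement against the structure) can be sketched as follows. This is a minimal illustrative sketch only: the function names (`allocate_query_tokens`, `cross_attend`), the sigmoid gate, the single-head attention, and the token budget range are all assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def allocate_query_tokens(node_feats, instr_emb, min_q=1, max_q=8):
    """Instruction-conditioned gate (assumed form): score the patch's
    complexity against the instruction and map it to a query-token
    budget in [min_q, max_q]."""
    pooled = node_feats.mean(axis=0)              # (d,) pooled patch summary
    gate = 1 / (1 + np.exp(-pooled @ instr_emb))  # scalar gate in (0, 1)
    return min_q + int(round(gate * (max_q - min_q)))

def cross_attend(queries, keys_values):
    """Single-head cross-attention: refine the allocated query tokens
    against the patch's atom embeddings."""
    d = queries.shape[-1]
    attn = softmax(queries @ keys_values.T / np.sqrt(d), axis=-1)
    return attn @ keys_values

d = 16
patch = rng.normal(size=(30, d))       # atom embeddings of one structural patch
instr = rng.normal(size=(d,))          # instruction embedding
n_q = allocate_query_tokens(patch, instr)
queries = rng.normal(size=(n_q, d))    # learnable queries in the real model
tokens = cross_attend(queries, patch)  # geometry-informed tokens for the LM
print(tokens.shape)                    # (n_q, 16)
```

The resulting `tokens` stand in for the geometry-informed tokens the paper injects into the language model; in the actual framework the gate and queries would be learned end to end rather than random.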

Zihao Jing, Qiuhao Zeng, Ruiyi Fang, Yan Yi Li, Yan Sun, Boyu Wang, Pingzhao Hu• 2026

Related benchmarks

Task                         Dataset            Metric     Result   Rank
Forward Reaction Prediction  Mol-Instructions   -          -        24
Reagent Prediction           Mol-Instructions   -          -        24
Retrosynthesis               Mol-Instructions   -          -        24
Molecule Captioning          Mol-Instructions   ROUGE-L    0.766    17
Multimodal Reasoning         GEO-AT Molecule    METEOR     0.415    17
Multimodal Reasoning         GEO-AT Protein     METEOR     41.7     17
Multimodal Reasoning         GEO-AT DNA         METEOR     52.9     17
Multimodal Reasoning         GEO-AT RNA         METEOR     0.491    17
Entity Recognition           Mol-Instructions   F1 Score   78       13
Interaction Extraction       Mol-Instructions   F1 Score   27       13

(Showing 10 of 23 rows)
