
ICA: Information-Aware Credit Assignment for Visually Grounded Long-Horizon Information-Seeking Agents

About

Despite the strong performance achieved by reinforcement learning-trained information-seeking agents, learning in open-ended web environments remains severely constrained by low signal-to-noise feedback. Text-based parsers often discard layout semantics and introduce unstructured noise, while long-horizon training typically relies on sparse outcome rewards that obscure which retrieval actions actually matter. We propose a visual-native search framework that represents webpages as visual snapshots, allowing agents to leverage layout cues to quickly localize salient evidence and suppress distractors. To learn effectively from these high-dimensional observations, we introduce Information-Aware Credit Assignment (ICA), a post-hoc method that estimates each retrieved snapshot's contribution to the final outcome via posterior analysis and propagates dense learning signals back to key search turns. Integrated with a GRPO-based training pipeline, our approach consistently outperforms text-based baselines on diverse information-seeking benchmarks, providing evidence that visual snapshot grounding with information-level credit assignment alleviates the credit-assignment bottleneck in open-ended web environments. The code and datasets will be released at https://github.com/pc-inno/ICA_MM_deepsearch.git.

Cong Pang, Xuyu Feng, Yujie Yi, Zixuan Chen, Jiawei Hong, Tiankuo Yao, Nang Yuan, Jiapeng Luo, Lewei Lu, Xin Lou • 2026
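
As a rough illustration of the idea in the abstract, the sketch below combines a GRPO-style group-relative advantage (outcome rewards centered and scaled within a group of rollouts) with a per-turn split of that advantage. It is not the authors' implementation: the abstract does not specify how ICA computes contributions, so the per-turn `turn_scores`, the proportional split, the uniform fallback, and both function names are assumptions made only for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Center and scale outcome rewards within one group of rollouts (GRPO-style)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def per_turn_signals(advantage, turn_scores):
    """Hypothetical helper: split one rollout's advantage across its search turns.

    turn_scores are assumed to be non-negative estimates of how much each
    retrieved snapshot contributed to the final answer. How ICA actually
    derives them is not described in the abstract.
    """
    if not turn_scores:
        return []
    total = sum(turn_scores)
    if total <= 0:
        # No turn stands out: fall back to a uniform split (an assumption).
        return [advantage / len(turn_scores)] * len(turn_scores)
    return [advantage * s / total for s in turn_scores]


if __name__ == "__main__":
    rewards = [1.0, 0.0, 0.0, 1.0]   # sparse outcome rewards for four rollouts
    advantages = group_relative_advantages(rewards)
    scores = [0.1, 0.7, 0.2]         # made-up contribution scores for three turns
    print(per_turn_signals(advantages[0], scores))
```

Under these assumptions, the turn with the highest contribution score receives the largest share of the rollout's advantage, which is one simple way a sparse outcome reward could be turned into a denser per-turn signal.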

Related benchmarks

Task                 Dataset     Result             Rank
Information Seeking  Browsecomp  Success Rate 17.1  19
Information Seeking  xBench-DS   Success Rate 75    18
Information Seeking  GAIA        Success Rate 65    13
Information Seeking  SEAL 0      Success Rate 27    6
