GLOVE: Global Verifier for LLM Memory-Environment Realignment

About

Most existing memory-enhanced Large Language Model (LLM) approaches implicitly assume that memory validity can be established either through external evaluators that provide task-specific success signals or through internal model cognition, such as reflection, for editing memory entries. However, these assumptions often break down in practical environments subject to dynamic drift. We propose the Global Verifier (GLOVE), a framework that introduces a new design dimension for LLM memory systems by establishing a relative notion of truth. By actively probing for inconsistencies between retrieved memories and fresh observations, GLOVE enables memory-environment realignment: it verifies and updates memory without access to ground-truth supervision or strong reliance on model introspection. We evaluate GLOVE on diverse benchmarks spanning web navigation, planning, and control, augmented with controlled environmental drifts that introduce non-stationarity beyond the original benchmark settings. Our results show that GLOVE substantially improves agent success rates, suggesting a robust pathway to cognitive agents capable of self-evolution.

Xingkun Yin, Hongyang Du • 2026
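
The probe-and-realign idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Memory` class, the `realign` function, the `probe` callback, and the majority-vote rule are all hypothetical names and choices introduced here for illustration. The sketch only shows the general pattern of comparing a remembered outcome against fresh observations and overwriting the memory when they disagree, with no ground-truth label involved.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, Hashable, Tuple

Key = Tuple[Hashable, Hashable]  # (state, action)


@dataclass
class Memory:
    """Hypothetical memory store: (state, action) -> remembered outcome."""
    entries: Dict[Key, Hashable] = field(default_factory=dict)


def realign(
    memory: Memory,
    state: Hashable,
    action: Hashable,
    probe: Callable[[Hashable, Hashable], Hashable],
    n_probes: int = 3,
) -> Tuple[Hashable, bool]:
    """Probe the environment and realign the memory entry if it is stale.

    Returns (fresh_outcome, drift_detected). Truth is relative here:
    the memory is judged only against fresh observations.
    """
    key = (state, action)
    remembered = memory.entries.get(key)

    # Actively re-try the same (state, action) several times.
    observations = [probe(state, action) for _ in range(n_probes)]

    # Assumption: a majority vote smooths over noisy observations.
    fresh = Counter(observations).most_common(1)[0][0]

    # Drift means an existing memory disagrees with what is observed now.
    drift = remembered is not None and remembered != fresh
    if remembered is None or drift:
        memory.entries[key] = fresh
    return fresh, drift
```

In this sketch, a first call on an empty memory simply records the observed outcome and reports no drift. If the environment's behavior later changes, the next call on the same state and action reports drift and overwrites the stale entry.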

Related benchmarks

Task	Dataset	Metric	Result	(unlabeled)
Web Navigation and Shopping	WebShop	Score	100	248
E-commerce Navigation and Search	WebShop semantic shift Hidden drift	Score	100	63
Grid-world Navigation	FrozenLake reward reversal Hidden drift	Score	100	45
Grid-world Navigation	FrozenLake (Source)	Score	67.5	36
Classic Control	MountainCar v1.0 (Drift I)	Success Rate	100	27
Classic Control	MountainCar v1.0 (Drift II)	Success Rate	100	27
Web navigation	WebShop Source	Success Rate	100	27
Continuous Control	MountainCar Source	Success Rate	100	27
Gridworld Navigation	FrozenLake (Source)	Success Rate	8.50e+3	27
E-commerce Navigation and Search	WebShop semantic shift Source	Score	1	18

Showing 10 of 49 rows
