
GLOVE: Global Verifier for LLM Memory-Environment Realignment

About

Most existing memory-enhanced Large Language Model (LLM) approaches implicitly assume that memory validity can be established either through external evaluators that provide task-specific success signals or through internal model cognition, such as reflection, for editing memory entries. However, these assumptions often break down in practical environments subject to dynamic drift. We propose the Global Verifier (GLOVE), a framework that introduces a new design dimension for LLM memory systems by establishing a relative notion of truth. By actively probing to detect inconsistencies between retrieved memories and fresh observations, GLOVE enables memory-environment realignment: it verifies and updates memory without access to ground-truth supervision or strong reliance on model introspection. We evaluate GLOVE on diverse benchmarks spanning web navigation, planning, and control, augmented with controlled environmental drifts that introduce non-stationarity beyond the original benchmark settings. Our results show that GLOVE substantially improves agent success rates, suggesting a robust pathway to cognitive agents capable of self-evolution.
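The abstract does not spell out the verification mechanism, so the following is only a toy sketch of the general idea of probing the environment and comparing the result against memory. It assumes memory is a table from (state, action) pairs to remembered outcomes and that a single fresh probe is enough to decide staleness, which would not hold in a stochastic environment. The names `realign_memory`, `probe`, and `drifted_env` are hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, Hashable, List, Tuple

Key = Tuple[Hashable, Hashable]  # (state, action); a hypothetical memory key


def realign_memory(
    memory: Dict[Key, Hashable],
    probe: Callable[[Hashable, Hashable], Hashable],
    keys: List[Key],
) -> List[Key]:
    """Re-probe the given (state, action) pairs and overwrite stale entries.

    Returns the keys whose remembered outcome disagreed with the fresh one.
    No ground-truth label is used: memory is only compared with observation.
    """
    stale = []
    for state, action in keys:
        fresh = probe(state, action)  # fresh observation from the environment
        if memory.get((state, action)) != fresh:
            stale.append((state, action))
            memory[(state, action)] = fresh  # assumption: trust the probe over memory
    return stale


# Toy drift: the outcome of ("s0", "right") changed after the memory was written.
def drifted_env(state, action):
    table = {("s0", "right"): "hole", ("s0", "down"): "s2"}
    return table[(state, action)]


memory = {("s0", "right"): "s1", ("s0", "down"): "s2"}
stale = realign_memory(memory, drifted_env, list(memory))
```

In this toy run, only the entry for `("s0", "right")` is flagged and rewritten, while the entry that still matches the environment is left untouched.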

Xingkun Yin, Hongyang Du • 2026

Related benchmarks

Task | Dataset | Result | Rank
E-commerce Navigation and Search | WebShop semantic shift Hidden drift | Score 100 | 63
Grid-world Navigation | FrozenLake reward reversal Hidden drift | Score 100 | 45
Grid-world Navigation | FrozenLake (Source) | Score 67.5 | 36
Web Navigation and Shopping | Webshop | -- | 33
Classic Control | MountainCar v1.0 (Drift I) | Success Rate 100 | 27
Classic Control | MountainCar v1.0 (Drift II) | Success Rate 100 | 27
Web navigation | WebShop Source | Success Rate 100 | 27
Continuous Control | MountainCar Source | Success Rate 100 | 27
Gridworld Navigation | FrozenLake (Source) | Success Rate 8.50e+3 | 27
E-commerce Navigation and Search | WebShop semantic shift Source | Score 1 | 18
Showing 10 of 49 rows
