ZenBrain: A Neuroscience-Inspired 7-Layer Memory Architecture for Autonomous AI Systems
About
On LongMemEval-500, ZenBrain comes within 4.5 pp of a long-context oracle's binary-judge accuracy ($47.7\%$ vs. $52.2\%$, i.e. $91.3\%$ of oracle accuracy) at $1/106^\text{th}$ of the per-query token cost (App. F.5-F.6, Fig. 2), and wins all 12 head-to-head answer-quality cells (4 systems $\times$ 3 LLM judges) against Letta, Mem0, and A-Mem under Bonferroni correction ($\alpha=0.05/18$, $p_\text{min}=6.2\times 10^{-31}$, $d \in [0.18, 0.52]$). ZenBrain is a 7-layer neuroscience-inspired memory architecture. The contribution is architectural integration: 15 validated neuroscience mechanisms unified under a single MemoryCoordinator, comprising 9 foundational algorithms (Two-Factor Synaptic KG, vmPFC-coupled FSRS, Simulation-Selection sleep, Bayesian confidence, and five more) plus 6 Predictive Memory Architecture components (NeuromodulatorEngine, ReconsolidationEngine, TripleCopyMemory, PriorityMap, StabilityProtector, MetacognitiveMonitor). No prior system integrates more than two of these mechanisms. A stress ablation (60 days, Wilcoxon, 10 seeds) reveals a cooperative survival network: under stress, 9 of 15 mechanisms become individually critical ($\Delta Q$ up to $-93.7\%$), whereas moderate conditions mask individual contributions. Simulation-Selection sleep improves stability by 37% with a 47.4% storage reduction ($p \le 5.1\times 10^{-3}$); TripleCopyMemory retains $S(t)=0.912$ at 30 days; multi-layer routing beats a flat baseline by $+20.7\%$ F1 on LoCoMo and $+19.5\%$ on MemoryArena. A cross-provider bias-direction check ($\Delta_\text{GPT-Anth}=-0.0001$ for ZenBrain vs. $-0.049$ for Mem0) rules out LLM-judge-specific confounds. The system is open-source, with 11,589 CI tests.
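The headline arithmetic in the abstract can be checked with a few lines of Python. This is only an illustrative sketch: the constants are copied from the abstract, while the variable and function names are hypothetical and not taken from the ZenBrain codebase. It computes the Bonferroni per-test threshold for 18 comparisons, the percentage-point gap to the oracle, and the relative accuracy (which comes out at roughly 91%, consistent with the quoted $91.3\%$ up to rounding of the inputs).

```python
# Figures quoted from the abstract; all names here are illustrative assumptions.
ALPHA = 0.05            # family-wise significance level
N_COMPARISONS = 18      # denominator used in the Bonferroni correction
P_MIN = 6.2e-31         # smallest reported p-value

ZENBRAIN_ACC = 47.7     # binary-judge accuracy, percent
ORACLE_ACC = 52.2       # long-context oracle accuracy, percent


def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    if n_comparisons < 1:
        raise ValueError("n_comparisons must be at least 1")
    return alpha / n_comparisons


threshold = bonferroni_threshold(ALPHA, N_COMPARISONS)
gap_pp = round(ORACLE_ACC - ZENBRAIN_ACC, 1)   # gap in percentage points
relative_acc = ZENBRAIN_ACC / ORACLE_ACC       # fraction of oracle accuracy

print(f"Bonferroni per-test threshold: {threshold:.5f}")
print(f"p_min below threshold: {P_MIN < threshold}")
print(f"Gap to oracle: {gap_pp} pp")
print(f"Relative accuracy: {relative_acc:.1%}")
```

The corrected threshold is about $2.8\times 10^{-3}$, so the reported $p_\text{min}$ clears it by many orders of magnitude.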
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Long-context Question Answering | LongMemEval-S cross-benchmark replication Full-500 | Jaccard Score (S-4.5, 3x) | 50.4 | 4 |
| Long-term memory evaluation | LongMemEval-S Full-500 official protocol | Mean Accuracy | 47.7 | 4 |
| Information Retrieval | LongMemEval-S cross-benchmark replication Full-500 | Precision@5 | 0.674 | 4 |
| Retrieval | LoCoMo real Retrieval post-G4 | Precision@5 | 0.081 | 4 |
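
The table reports precision-at-k and Jaccard metrics without stating how they are computed. As a rough guide, the sketch below shows the conventional set-based definitions. These are assumptions: the paper's exact protocol (tokenization, tie handling, what counts as relevant) is not given here, and the function names and sample IDs are hypothetical.

```python
from typing import Hashable, Iterable, Sequence, Set


def precision_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the top-k retrieved items that are relevant.

    Divides by k even when fewer than k items were retrieved (one common
    convention; other evaluations divide by the number actually returned).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def jaccard(a: Iterable[Hashable], b: Iterable[Hashable]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two collections."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 1.0  # two empty sets are treated as identical
    return len(set_a & set_b) / len(union)


# Hypothetical memory IDs returned by a retriever, best match first.
retrieved = ["m1", "m7", "m3", "m9", "m2"]
gold = {"m1", "m3", "m4"}
p_at_5 = precision_at_k(retrieved, gold, k=5)

# Hypothetical token sets for a predicted and a reference answer.
overlap = jaccard({"paris", "is", "capital"}, {"paris", "capital", "city"})

print(p_at_5, overlap)
```

Here two of the five retrieved items are relevant, and the two token sets share two of four distinct tokens.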