Long-context Classification on OOLONG (test)

58.1 TREC-Q-coarse Accuracy

RLM + PEEK

Updated 2mo ago

Evaluation Results

Method	Date	Links	TREC-Q-coarse Accuracy
RLM + PEEK	2026.05		58.1	69.4	57
RLM + ACE (Online Adaptation)	2026.05		48.8	61.6	42
RLM + Compaction Agent	2026.05		42	49.5	30
RLM + RAG	2026.05		36.6	63.1	29
RLM + Shared Chat	2026.05		32	49.6	23
RLM	2026.05		30.3	46.5	23