Temporal Reasoning on LoCoMo (test)
[Chart: highlighted result is FullContext, LLM Score 0.742 (Jan 13, 2026). Available metrics: LLM Score, F1 Score, BLEU-1.]
Evaluation Results
Method        Configuration      Date      LLM Score   F1 Score   BLEU-1
FullContext   LLM=GPT-4.1-mini   2026.01   0.742       47.5       40
Nemori        LLM=GPT-4.1-mini   2026.01   0.735       58.5       50.1
SwiftMem      LLM=GPT-4.1-mini   2026.01   0.685       50.7       56.9
Zep           LLM=GPT-4.1-mini   2026.01   0.602       23.9       20
Mem0          LLM=GPT-4.1-mini   2026.01   0.569       39.2       33.2
LangMem       LLM=GPT-4.1-mini   2026.01   0.508       48.5       40.9
RAG           LLM=GPT-4.1-mini   2026.01   0.274       22.3       19.1
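
The page does not state how the F1 Score and BLEU-1 columns are computed. For readers unfamiliar with these metrics, the following is a minimal sketch of one common convention for question-answering benchmarks: token-level F1 and unigram BLEU with a brevity penalty, both on a 0 to 1 scale (the table appears to report them multiplied by 100). The function names, the whitespace tokenizer, and the example strings are illustrative assumptions, not the benchmark's actual evaluation code.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; real evaluators
    # may also strip punctuation or articles.
    return text.lower().split()


def token_f1(prediction, reference):
    """Harmonic mean of token precision and recall (0.0 to 1.0)."""
    pred, ref = tokenize(prediction), tokenize(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def bleu1(prediction, reference):
    """Clipped unigram precision times a brevity penalty (0.0 to 1.0)."""
    pred, ref = tokenize(prediction), tokenize(reference)
    if not pred:
        return 0.0
    clipped = sum((Counter(pred) & Counter(ref)).values())
    precision = clipped / len(pred)
    # Penalize predictions shorter than the reference.
    if len(pred) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(pred))
    return brevity_penalty * precision


if __name__ == "__main__":
    # Hypothetical temporal question: the answer omits the day.
    print(token_f1("May 2023", "7 May 2023"))
    print(bleu1("May 2023", "7 May 2023"))
```

Under this convention a partially correct date scores well on F1 but is penalized by BLEU-1 for being shorter than the reference, which is one way the two columns can rank methods differently from each other and from the LLM Score.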