Persona-based memory dialogue on PersonaMem
[Chart: Normalized Score and Discriminability. Top result: 65.2 Normalized Score, Qwen3-Max-Thinking, Mar 24, 2026. Updated 24d ago.]
Evaluation Results

Method              Method (settings)          Links    Normalized Score  Discriminability
Qwen3-Max-Thinking  formatting=multi-turn,...  2026.03  65.2              15
DeepSeek-V3.2       formatting=multi-turn,...  2026.03  64.52             15
GLM-5               formatting=multi-turn,...  2026.03  63.16             15
Kimi-K2.5           formatting=multi-turn,...  2026.03  55.35             15
MiniMax-M2.5        formatting=multi-turn,...  2026.03  42.78             15