Response Selection on PersonaMem
Top result: TALLRec, 64.36 Accuracy (Jan 8, 2026)
Evaluation Results

| Method | Inference Setting | Date | Accuracy |
| --- | --- | --- | --- |
| TALLRec | Dire... | 2026.01 | 64.36 |
| GPT-OSS-20B | Full... | 2026.01 | 61.74 |
| DeepSeek-R1-671B | Full... | 2026.01 | 61.44 |
| Qwen3-8B (non-thinking) | Dire... | 2026.01 | 61.4 |
| DeepSeek-R1-671B | Stre... | 2026.01 | 58.96 |
| ALIGNXPLORE+ | Full... | 2026.01 | 58.08 |
| Qwen3-32B (thinking) | Full... | 2026.01 | 57.36 |
| GPT-OSS-20B | Stre... | 2026.01 | 54.82 |
| ALIGNXPLORE+ | Stre... | 2026.01 | 54.58 |
| Qwen3-8B (thinking) | Full... | 2026.01 | 54.36 |
| ALIGNXPLORE | Full... | 2026.01 | 53.98 |
| Qwen3-32B (thinking) | Stre... | 2026.01 | 53.26 |
| Qwen3-8B (thinking) | Stre... | 2026.01 | 51.68 |
| DS-R1-Distill-Qwen-7B | Full... | 2026.01 | 49.28 |
| ALIGNXPLORE | Stre... | 2026.01 | 48.42 |
| DS-R1-Distill-Qwen-7B | Stre... | 2026.01 | 46.64 |