Single-session-user on LongMemEval-S
No plottable results for F1 (PERCENT).
Metrics: F1 (PERCENT), Accuracy (PERCENT), LLM-as-a-Judge Score (SCALAR)
Updated 4d ago
Evaluation Results
Method | Links | F1 | Accuracy | LLM-as-a-Judge Score
No evaluation results found.