
Task Completion on Synthetic personalized interaction datasets (evaluation)

Task Completion Score: 8.48

History-augmented prompting

[Chart: Task Completion Score; y-axis ticks 7.3776, 7.6638, 7.95, 8.2362; x-axis Feb 12, 2026]
Updated 4d ago

Evaluation Results

Date      Task Completion Score   Gap to best
2026.02   8.48                    -
2026.02   8.48                    -
2026.02   8.48                    -
2026.02   8.48                    -
2026.02   8.48                    -
2026.02   8.26                    0.22
2026.02   7.94                    0.54
2026.02   7.94                    0.54
2026.02   7.89                    0.59
2026.02   7.42                    1.06
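
The trailing figure in each result row appears to be the shortfall from the top score of 8.48 (for example, 8.48 - 8.26 = 0.22). A minimal sketch, assuming that reading, that recomputes those gaps from the scores listed above; the function name `gaps_to_best` is hypothetical and not part of the page:

```python
# Scores copied from the evaluation results above, in row order.
scores = [8.48, 8.48, 8.48, 8.48, 8.48, 8.26, 7.94, 7.94, 7.89, 7.42]


def gaps_to_best(values):
    """Return each value's shortfall from the maximum, rounded to 2 decimals.

    Assumption: the leaderboard's second number is (best score - row score).
    """
    best = max(values)
    return [round(best - v, 2) for v in values]


gaps = gaps_to_best(scores)

# Print each score next to its gap; rows tied for the top show "-".
for score, gap in zip(scores, gaps):
    print(f"{score:.2f}  {'-' if gap == 0 else f'{gap:.2f}'}")
```

Under this assumption the recomputed gaps match the figures shown for every row, which supports reading the second number as a difference from the best result rather than a separate metric.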