User Simulation Quality Assessment on Session-level Human Evaluation Set adversarial (test)

Win Rate: 86

UserLM-R1

Jan 14, 2026
Updated 1mo ago

Evaluation Results

Method / Links
2026.01    86    22    12
2026.01    55    32    33
2026.01    44    28    48