
User Simulation Quality Assessment on Turn-level Human Evaluation Set adversarial (test)

168 Win Rate

UserLM-R1

[Chart: Win Rate by date; axis ticks 62.96, 90.23, 117.51, 144.77; Jan 14, 2026]
Updated 1mo ago

Evaluation Results

Method, Links
2026.01
1682042
2026.01
1424038
2026.01
6711142