
User Simulation on Adversarial User Simulation Dataset Turn-level (test)

95 Robotics Score

CharacterGLM

0.464  25.007  49.557  4.093  Jan 14, 2026
Updated 1mo ago

Evaluation Results

Method  Links
95    -      33.97  37.61  -      -      2026.01
86.4  -      38.3   41.55  -      -      2026.01
78.6  -      38.3   43.05  -      -      2026.01
66.4  -      42.98  47.64  -      -      2026.01
31.4  39.86  48.73  57.73  76.64  48.47  2026.01
14.1  55.52  59.98  69.66  82.27  61.55  2026.01
12.3  41.82  52.86  62.39  80.14  53.38
9.5   41.91  52.64  63.8   80.73  53.75
6.4   49.09  59.11  69.64  83.5   60.17  2026.01
4.5   73.39  74.52  80.5   92.55  76.56
4.1   45.57  57.15  66.92  82.97  58.01