
User Simulation on MirrorBench

Realism Score (LLM-judge): 0.713

DITTO

[Chart: Realism Score (LLM-judge) over time; y-axis ticks 0.3438, 0.43965, 0.5355, 0.63135; x-axis label May 19, 2026]
Updated 13d ago

Evaluation Results

Date       Realism Score (LLM-judge)
2026.05    0.713
2026.05    0.683
2026.05    0.547
2026.05    0.536
2026.05    0.481
2026.05    0.358