
GUI Agent Attack Success Rate Evaluation on MIRAGE (1,111-sample main set)

FB Success Rate: 41

gpt-4o-mini

[Chart: y-axis ticks 25.4, 29.45, 33.5, 37.55; x-axis date May 27, 2026]
Updated 6d ago

Evaluation Results

Method  Links
2026.05  41  21  46  30  22  26  23  32  23  39  30.2
2026.05  29  24  47  23  27  25  24  32  26  39  28.9
2026.05  28  25  41  24  29  24  30  32  19  40  28.6
2026.05  27  19  39  22  26  20  22  25  13  25  23
2026.05  26  20  41  26  22  23  25  29  16  29  25