
Mind2Web

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Web agent tasks | Mind2Web Cross-Task | Step Success Rate: 53.2 | 64 |
| Web Navigation Task Success | MIND2WEB ONLINE (test) | Task Success Rate (Overall): 67 | 41 |
| Web navigation | Mind2Web | Overall Success Rate: 58.7 | 41 |
| Web agent tasks | Mind2Web (Cross-Website) | Element Accuracy: 57.2 | 40 |
| GUI Web Agent Navigation | Mind2Web Online | Overall Average Score: 67.3 | 37 |
| Web Navigation | Mind2Web Cross-Domain | Element Accuracy (EA): 65.2 | 37 |
| Web agent tasks | Mind2Web Cross-Domain | Ele.Acc: 55.7 | 37 |
| GUI Navigation | Multimodal-Mind2Web Cross-Domain | Step Success Rate: 67.1 | 32 |
| GUI Navigation | Multimodal-Mind2Web Cross-Task | Step Success Rate: 71.5 | 32 |
| GUI Navigation | Mind2Web Cross-Task | Element Accuracy: 66 | 30 |
| Web Agent Navigation | Mind2Web Cross-Domain 1.0 | Success Rate: 445 | 26 |
| Web Agent Navigation | Mind2Web Cross-Task 1.0 | Success Rate: 49.2 | 26 |
| GUI Agent Navigation | Mind2Web | Success Rate: 45.68 | 24 |
| Adversarial Attack against WebExperT agent | Mind2Web 600 tasks (test) | ASR (Finance, pass@10): 46.2 | 24 |
| Adversarial Attack against SeeAct agent | Mind2Web 600 tasks (test) | ASR Finance (pass@10): 54.1 | 24 |
| GUI Navigation | Mind2Web (Cross-Website) | Element Accuracy: 44.6 | 23 |
| Web Navigation | MM-Mind2Web | Step Success Rate (SR): 22.97 | 22 |
| Web Navigation Task Completion | Mind2Web Cross-task | Success Rate: 64.6 | 18 |
| Web Agent Navigation | Mind2Web All 1.0 | Element Accuracy: 0.484 | 16 |
| Web Action Generation Efficiency | Mind2Web (All) | Time to Proposal Steps: 363.6 | 16 |
| Web Action Generation Efficiency | Mind2Web Cross-Domain | To_Pro (Steps/Time): 334.9 | 16 |
| Web Action Generation Efficiency | Mind2Web Cross-Website | To_Pro Steps/Time: 364.1 | 16 |
| Web Action Generation Efficiency | Mind2Web Cross-Task | Time to Procedure: 378.2 | 16 |
| Web Navigation | Mind2Web Live (test) | Task Completion Rate: 52.8 | 16 |
| Element Grounding | Multimodal-Mind2Web Cross-Task | Element Accuracy: 50.7 | 16 |
Showing 25 of 86 rows