Mind2Web

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Web agent tasks | Mind2Web Cross-Task | Element Accuracy: 59 | 57 |
| Web agent tasks | Mind2Web (Cross-Website) | Element Accuracy: 57.2 | 40 |
| Web agent tasks | Mind2Web Cross-Domain | Ele.Acc: 55.7 | 37 |
| GUI Navigation | Mind2Web Cross-Task | Element Accuracy: 66 | 30 |
| GUI Navigation | Multimodal-Mind2Web Cross-Domain | Step Success Rate: 67.1 | 27 |
| GUI Navigation | Multimodal-Mind2Web Cross-Task | Step Success Rate: 71.5 | 27 |
| Web Navigation | Mind2Web Cross-Domain | Element Accuracy (EA): 53.25 | 26 |
| Web Agent Navigation | Mind2Web Cross-Domain 1.0 | Success Rate: 445 | 26 |
| Web Agent Navigation | Mind2Web Cross-Task 1.0 | Success Rate: 49.2 | 26 |
| Web navigation | Mind2Web | Overall Success Rate: 58.7 | 26 |
| Adversarial Attack against WebExperT agent | Mind2Web 600 tasks (test) | ASR (Finance, pass@10): 46.2 | 24 |
| Adversarial Attack against SeeAct agent | Mind2Web 600 tasks (test) | ASR Finance (pass@10): 54.1 | 24 |
| GUI Navigation | Mind2Web (Cross-Website) | Element Accuracy: 44.6 | 23 |
| Web Navigation Task Completion | Mind2Web Cross-task | Success Rate: 64.6 | 18 |
| Web Navigation Task Success | MIND2WEB ONLINE (test) | Task Success Rate (Overall): 64.7 | 18 |
| Web Agent Navigation | Mind2Web All 1.0 | Element Accuracy: 0.484 | 16 |
| Web Action Generation Efficiency | Mind2Web (All) | Time to Proposal Steps: 363.6 | 16 |
| Web Action Generation Efficiency | Mind2Web Cross-Domain | To_Pro (Steps/Time): 334.9 | 16 |
| Web Action Generation Efficiency | Mind2Web Cross-Website | To_Pro Steps/Time: 364.1 | 16 |
| Web Action Generation Efficiency | Mind2Web Cross-Task | Time to Procedure: 378.2 | 16 |
| Web Navigation | Mind2Web Live (test) | Task Completion Rate: 52.8 | 16 |
| Element Grounding | Multimodal-Mind2Web Cross-Task | Element Accuracy: 50.7 | 16 |
| Web Navigation Task Completion | Mind2Web (Cross-website 177) | Success Rate: 66.2 | 14 |
| Web Navigation | Multimodal-Mind2Web Average | Avg. Step Success Rate: 54.3 | 14 |
| Action Prediction | Mind2Web Cross-Domain | Operation F1: 85.7 | 14 |
Showing 25 of 69 rows