Mind2Web

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Web agent tasks | Mind2Web Cross-Task | Element Accuracy: 59 | 49 |
| Web agent tasks | Mind2Web (Cross-Website) | Element Accuracy: 57.2 | 40 |
| Web agent tasks | Mind2Web Cross-Domain | Ele.Acc: 55.7 | 37 |
| GUI Navigation | Multimodal-Mind2Web Cross-Domain | Step Success Rate: 67.1 | 27 |
| GUI Navigation | Multimodal-Mind2Web Cross-Task | Step Success Rate: 71.5 | 27 |
| GUI Navigation | Mind2Web (Cross-Website) | Element Accuracy: 44.6 | 23 |
| Web Agent Navigation | Mind2Web All 1.0 | Element Accuracy: 0.484 | 16 |
| Web Agent Navigation | Mind2Web Cross-Domain 1.0 | Element Accuracy: 47.4 | 16 |
| Web Agent Navigation | Mind2Web Cross-Task 1.0 | Element Accuracy: 54.9 | 16 |
| Web Action Generation Efficiency | Mind2Web (All) | Time to Proposal Steps: 363.6 | 16 |
| Web Action Generation Efficiency | Mind2Web Cross-Domain | To_Pro (Steps/Time): 334.9 | 16 |
| Web Action Generation Efficiency | Mind2Web Cross-Website | To_Pro Steps/Time: 364.1 | 16 |
| Web Action Generation Efficiency | Mind2Web Cross-Task | Time to Procedure: 378.2 | 16 |
| Web Navigation Task Success | MIND2WEB ONLINE (test) | Task Success Rate (Easy): 84 | 16 |
| Web Navigation | Mind2Web Live (test) | Task Completion Rate: 52.8 | 16 |
| Element Grounding | Multimodal-Mind2Web Cross-Task | Element Accuracy: 50.7 | 16 |
| Web Navigation | Multimodal-Mind2Web Average | Avg. Step Success Rate: 54.3 | 14 |
| Action Prediction | Mind2Web Cross-Domain | Operation F1: 85.7 | 14 |
| Action Prediction | Mind2Web Cross-Task | Operation F1 Score: 87.6 | 14 |
| Reward Modeling | Mind2Web | Pairwise Acc: 97.07 | 13 |
| GUI Navigation | Mind2Web Cross-Domain | Element Accuracy: 35.7 | 12 |
| GUI Navigation | Mind2Web Cross-Task | Element Accuracy: 36.9 | 12 |
| Conversational web navigation | MT-Mind2Web Cross-Subdomain | Element Accuracy: 52 | 12 |
| Conversational web navigation | MT-Mind2Web (Cross-Website) | Element Accuracy: 48.3 | 12 |
| Web Browsing Action Prediction | Mind2Web (Cross-Website) | Operation F1: 84.8 | 11 |

Showing 25 of 47 rows