Single-life task completion on Tabletop
[Chart: Avg Steps over time. Single data point: QWALE, 44,400 Avg Steps, Oct 17, 2022.]
Updated 4d ago
Evaluation Results
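| Method  | Setting         | Date    | Avg Steps | Std Error | Success Rate | Median Steps |
|---------|-----------------|---------|-----------|-----------|--------------|--------------|
| QWALE   | mode=finetuning | 2022.10 | 44,400    | 24,600    | 90           | 8,900        |
| GAIL-sa | mode=finetuning | 2022.10 | 61,500    | 28,700    | 70           | 2,400        |
| GAIL-s  | mode=finetuning | 2022.10 | 83,200    | 23,800    | 80           | 75,600       |
| SAC-RND | mode=finetuning | 2022.10 | 94,800    | 26,900    | 70           | 51,400       |
| SAC     | mode=finetuning | 2022.10 | 123,700   | 25,500    | 70           | 157,200      |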