SOTA Decision Making benchmarks and papers with code | Wizwand
Decision Making
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
|---|---|---|---|---|---|
| AlfWorld | Teacher (LLaMA-13B) | Steps | 6.4 | 22 | 1mo ago |
| WebShop | Dual-Process (AUQ) | Success Rate | 42.9 | 15 | 1mo ago |
| 30 synthetic decision-making rounds (evaluation) | Lark Full | Mean Rank | 2.55 | 14 | 11d ago |
| Decision Making OOD | TIMEOMNI-1 | ACC | 58.9 | 13 | 1mo ago |
| Decision Making (ID) | TIMEOMNI-1 | Accuracy | 47.9 | 13 | 1mo ago |
| TSR-Suite Task 4 | TIMEOMNI-VL | Accuracy | 61.4 | 8 | 1mo ago |
| OAS (test) | LP | Timely Score | 0.5037 | 7 | 1mo ago |
| GlobalStore (test) | LP | Timeliness Score | 35.52 | 7 | 1mo ago |
| DataCo (test) | S2A | Timely Score | 0.5447 | 7 | 1mo ago |
| OR-ShARC (test) | EFT | Micro Aggregation Score | 0.785 | 7 | 1mo ago |
| OR-ShARC (dev) | EFT | Micro Avg | 83.4 | 7 | 1mo ago |
| Deliberative decision-making tasks n=45 (overall) | DCI | Mean Tokens | 237,565 | 5 | 1mo ago |
| Pandora's Box | Oracle policy | Optimal Match Rate | 1 | 5 | 1mo ago |
| SinerGym (test) | Vintix II | Normalized Score | 92 | 4 | 10d ago |
| MuJoCo (test) | Vintix II | Normalized Score | 1 | 4 | 10d ago |
| MetaDrive (test) | Vintix | Normalized Score | 1.02 | 4 | 10d ago |
| Meta-World (test) | Vintix II | Normalized Score | 69 | 4 | 10d ago |
| Kinetix (test) | Vintix II | Normalized Score | 0.23 | 4 | 10d ago |
| Industrial Benchmark (test) | Vintix II | Normalized Score | 0.98 | 4 | 10d ago |
| HumEnv (test) | Vintix II | Normalized Score | 9 | 4 | 10d ago |
| ControlGym (test) | Vintix | Normalized Score | 100 | 4 | 10d ago |
| CityLearn (test) | Vintix II | Normalized Score | 78 | 4 | 10d ago |
| Bi-DexHands (test) | Vintix | Normalized Score | 0.45 | 4 | 10d ago |
| StarCraft II (SC2) built-in AI LV7 (VeryHard) (test) | StarWM-Agent | Win Rate | 50 | 4 | 1mo ago |
| StarCraft II built-in AI LV6 (Harder) (test) | StarWM-Agent | Win Rate | 40 | 4 | 1mo ago |
Showing 25 of 36 rows