Offline Meta-Reinforcement Learning on Walker-Rand-Params sampled 10 unseen (test)
[Chart: Average Return over time. Best result: CSRO, 344.2 (Nov 7, 2023)]
Updated 1mo ago
Evaluation Results

Method    Description                Date     Average Return
CSRO      non-prior context coll...  2023.11  344.2
CSRO      non-prior context coll...  2023.11  319.7
CORRO     non-prior context coll...  2023.11  312.5
OffPearl  non-prior context coll...  2023.11  284.5
CORRO     non-prior context coll...  2023.11  275.2
OffPearl  non-prior context coll...  2023.11  262
BOREL     non-prior context coll...  2023.11  260.6
FOCAL     non-prior context coll...  2023.11  253.3
FOCAL     non-prior context coll...  2023.11  247.5
BOREL     non-prior context coll...  2023.11  245.8