
Matrix game

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Value function estimation | Matrix Game B v1 (test) | Q1 Value (A): 6.2 | 4 |
| Multi-objective offline reinforcement learning | Two-agent offline matrix game | Metric: - | 0 |