
Two-state environment

Benchmarks

Task Name                     Dataset Name               SOTA Result                   Trend
Off-policy prediction         Two-state environment      Steady-state AUC Error: 3.67  9
Linear off-policy prediction  New two-state environment  Max RMSE: 3.89                8
Linear off-policy prediction  Two-state environment      Max RMSE: 1.697               8