pen

Benchmarks

Task Name                                        Dataset Name   SOTA Result                      Trend
Off-policy evaluation for classification error   pen            Bias: -0.095                     15
Offline-to-online Reinforcement Learning         pen            Regret: 5.3                      12
Trojan Attack (Target action: 'fixed random')    Pen            Attack Success Rate (ASR): 100   9
Trojan Attack (Target action: 'arithmetic')      Pen            ASR: 0                           9
Trojan Attack (Target action: '1')               Pen            ASR: 100                         9
Reinforcement Learning                           pen human      Normalized Return: 53.4          4
Reinforcement Learning                           pen cloned     Normalized Return: 58.9          4