Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

GPT-2 Concept Steering

Benchmarks

Task NameDataset NameSOTA ResultTrend
Concept SteeringGPT-2 Concept Steering 3 concepts x 4 layers x 25 contexts (test)
Median Off-target KL Ratio1.32
12
Showing 1 of 1 rows