Output-based feature description faithfulness on GPT2 MLP SAE
Top result: EnsembleR (VP+TC), Faithfulness Score 40.9 (Jan 14, 2025)
Evaluation Results
Method             Configuration              Date     Faithfulness Score
EnsembleR (VP+TC)  SAE Width=32k, Layer A...  2025.01  40.9
EnsembleR (MA+TC)  SAE Width=32k, Layer A...  2025.01  40.3
VocabProj          SAE Width=32k, Layer A...  2025.01  38.3
EnsembleR (MA+VP)  SAE Width=32k, Layer A...  2025.01  38.1
EnsembleC (All)    SAE Width=32k, Layer A...  2025.01  37.2
EnsembleR (All)    SAE Width=32k, Layer A...  2025.01  37.1
TokenChange        SAE Width=32k, Layer A...  2025.01  36.5
MaxAct             SAE Width=32k, Layer A...  2025.01  34