Input-based feature description faithfulness on GPT2 MLP SAE

Faithfulness Score: 51.2

EnsembleR (MA+VP)

Jan 14, 2025
Updated 4d ago

Evaluation Results

Date      Score
2025.01   51.2
2025.01   51.1
2025.01   50.2
2025.01   39.7
2025.01   24.4
2025.01   7.1
2025.01   6.3
2025.01   6.1