
Output-based feature description faithfulness on GPT2 Res. SAE

Faithfulness Score: 47.2

EnsembleR (MA+VP)

[Chart: Faithfulness Score over time; y-axis ticks 42.624, 43.812, 45, 46.188; x-axis tick Jan 14, 2025]
Updated 4d ago

Evaluation Results

Date      Faithfulness Score
2025.01   47.2
2025.01   47.2
2025.01   47.2
2025.01   46.9
2025.01   44.2
2025.01   44.1
2025.01   43.4
2025.01   42.8