
LLM Activations

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Downstream Utility Evaluation | LLM Activations | Sparse Probing Accuracy: 87.9 | 8 |
| Feature Interpretability | LLM Activations | AutoInterp Score: 86.9 | 8 |
| Hierarchical Feature Alignment | LLM Activations | Absorption: 98.8 | 8 |
| Sparse Reconstruction | LLM Activations | L0: 49.4 | 8 |