| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Training Throughput | GPT2-1.5B | Throughput | 4.1 | 25 |
| Training Data Attribution | GPT2-small | LDS Score | 0.3936 | 10 |
| Output-based feature description faithfulness | GPT2 MLP SAE | Faithfulness Score | 40.9 | 8 |
| Input-based feature description faithfulness | GPT2 MLP SAE | Faithfulness Score | 51.2 | 8 |
| Output-based feature description faithfulness | GPT2 Res. SAE | Faithfulness Score | 47.2 | 8 |
| Input-based feature description faithfulness | GPT2 Res. SAE | Faithfulness Score | 60.4 | 8 |
| Private text generation | GPT2-base (124M) | Usage Fraction | 100 | 7 |
| Private Inference | GPT2-base (124M) | Embed Inference Time (s) | 5.17 | 7 |
| Feature Matching | GPT2 Layer 0 match with Layer 11 | LLM Eval Score | 1.39 | 6 |
| Feature Matching | GPT2 Layer 5 match with Layer 11 | LLM Eval | 1.56 | 6 |
| Adversarial Attack | GPT2 F.t. | ASR (%) | 74.25 | 6 |
| Circuit Compression | GPT2-small Digit Addition | Accuracy | 68.12 | 5 |
| Feature Matching | GPT2 Layer 5 match with Layer 6 | LLM Eval | 2.53 | 4 |
| Sparse Probing | GPT2 Small | Average F1 | 74.3 | 4 |
| Activation Reconstruction | GPT2 Small | MSE | 0.32 | 4 |