Autoregressive Language Modeling on WikiText-103 (first 10M tokens)
[Chart: Perplexity (PPL) and Relative Change vs TF (%) by method, as of Apr 9, 2026]
Evaluation Results

| Method | Configuration | Date | Perplexity (PPL) | Relative Change vs TF (%) |
|---|---|---|---|---|
| TF-GPT | Inference Cache=O(N) | 2026.04 | 90.5 | - |
| MIPT+Cache | K=64, Inference Cache=... | 2026.04 | 92.1 | 1.8 |
| MIPT+Cache | K=16, Inference Cache=... | 2026.04 | 96.3 | 6.4 |
| MIPT+Cache | K=8, Inference Cache=O(8) | 2026.04 | 98.1 | 8.4 |
| MIPT-LM | Inference Cache=O(1) | 2026.04 | 102.2 | 12.9 |
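The "Relative Change vs TF (%)" column appears to be derived from the perplexity values, using TF-GPT (PPL 90.5) as the baseline. A minimal sketch of that computation (the method labels in the dictionary are shorthand for the table rows, not names from the source):

```python
# Reproduce the "Relative Change vs TF (%)" column from the PPL values above.
# TF-GPT is the baseline; a positive change means worse (higher) perplexity.

TF_BASELINE_PPL = 90.5

results = {
    "MIPT+Cache (K=64)": 92.1,
    "MIPT+Cache (K=16)": 96.3,
    "MIPT+Cache (K=8)": 98.1,
    "MIPT-LM": 102.2,
}

def relative_change(ppl: float, baseline: float = TF_BASELINE_PPL) -> float:
    """Percent increase in perplexity over the TF-GPT baseline."""
    return round((ppl - baseline) / baseline * 100, 1)

for method, ppl in results.items():
    print(f"{method}: PPL={ppl}, +{relative_change(ppl)}% vs TF")
# MIPT+Cache (K=64): PPL=92.1, +1.8% vs TF
# MIPT+Cache (K=16): PPL=96.3, +6.4% vs TF
# MIPT+Cache (K=8): PPL=98.1, +8.4% vs TF
# MIPT-LM: PPL=102.2, +12.9% vs TF
```

Rounding to one decimal place matches every entry in the table, which suggests the column is computed rather than measured independently.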