
Transformer

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Transformer Inference | 12-layer Transformer 1024 tokens (inference) | Speedup: 342.74 | 24 |
| Jacobian computation | Transformer | Median Runtime (ms): 0.28 | 8 |
| Scaling Efficiency | Transformer 128 tokens | Scaling Efficiency (Linear Projection): 93.29 | 5 |
| Finding Optimal Elimination Order | Transformer | Number of Multiplications: 4,656 | 5 |