
Transformer

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Jacobian computation | Transformer | Median Runtime (ms): 0.28 | 8
Scaling Efficiency | Transformer 128 tokens | Scaling Efficiency (Linear Projection): 93.29 | 5
Finding Optimal Elimination Order | Transformer | Number of Multiplications: 4,656 | 5