
Attention Operator Throughput on Llama2 7B (32 Q-heads/32 KV-heads/128 Head-dimension)

207.3 Attention TFLOPS

flash-attn v2

[Chart: Attention TFLOPS over time, Jun 14, 2025 to May 6, 2026]
Updated 27d ago

Evaluation Results

Date      Attention TFLOPS
2025.06   207.3
2025.06   202.7
2025.06   201.5
2025.06   201.2
2025.06   198.3
2025.06   197.2
2025.06   188.3
2025.06   186.7
2025.06   180.3
2025.06   176.8
2025.06   173.4
2025.06   167.1
2025.06   164.1
2025.06   160.6
2025.06   158.4
2025.06   152.5
2025.06   150.2
2026.05   145.69
2025.06   142.6
2025.06   137.1
2026.05   131.67
2025.06   128.6
2025.06   122.5
2026.05   116.69
2025.06   112.4
2026.05   110.44
2025.06   108.6
2026.05   105.55
2026.05   92.5
2025.06   82.8
2026.05   79.37
2026.05   77.09
2026.05   44.46
2025.06   15.1
2025.06   15.1
2025.06   14.6
2025.06   14.6
2025.06   13.3
2026.05   13.12
2026.05   12.36
2026.05   10.99
2025.06   10.9
2026.05   -         27.2219.7225.4829.61
2026.05   -         211.17319.03374.71404.09
2026.05   -         81.07126.08175.6215.75
2026.05   -         304.35414.05577.03551.73
2026.05   -         23.659.2563.4
2026.05   -         109.37111.62113.39115.7
2026.05   -         65.4353.4338.7724.62
2026.05   -         137.8176.88182.44151.9