
MHA

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Multi-Head Attention (MHA) | MHA causal mask head dimension 128 FP8 on NVIDIA L40S GPU (test) | Performance (TFLOPS): 257.9 | 6
Multi-Head Attention Operator Development | MHA (hd=64, sl=1024) on A100 GPU | TFLOPS: 175.6 | 2