Llama2

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Attention Operator Throughput | Llama2 7B (32 Q-heads/32 KV-heads/128 Head-dimension) | Attention TFLOPS: 207.3 | 42 |
| LLM Inference | LLaMA2 7B | TTFT (ms): 11.09 | 33 |
| Distributed Training | LLaMA2 13B | Number of Recomputed Layers: 0 | 21 |
| Jailbreak attack | Llama2-7b five finetuned variants | Average ASR: 0 | 16 |
| Accuracy | LLaMA2-7B zero-shot | Zero-Shot Accuracy: 67.18 | 16 |
| Targeted Refusal | Llama2-7B Generation Evaluation Set | Completion Accuracy (CA): 93.14 | 15 |
| Sentiment Steering | Llama2-7B Generation Evaluation Set | Accuracy (CA): 90.15 | 15 |
| Multi-bit Watermarking | LLaMA2-7B 300 tokens (test) | Perplexity: 7.0486 | 14 |
| Inference Efficiency | LLaMA2-7B 12/128 tokens | Latency: 1.889 | 13 |
| Jailbreak Attack | llama2-7b v1 (pretrained) | ASR: 0 | 13 |
| Watermark Detection | Llama2-7B Copy-paste attack | F1 Score: 97.8 | 11 |
| LLM fingerprinting | Llama2 7B | AUC: 100 | 10 |
| Watermark Detection | Llama2-7B Paragraphing | F1 Score: 91.6 | 8 |
| Watermark Detection | Llama2-7B Synonymous substitution | F1 Score: 98.5 | 8 |
| Watermark Detection | Llama2-7B Clean | F1 Score: 100 | 8 |
| Efficiency Analysis | LLaMA2-7B | Memory Usage (GB): 1.34 | 7 |
| Jailbreak Defense | LLaMA2-7B Adaptive AutoDAN-T attack | ASR: 17 | 6 |
| Jailbreak Defense | LLaMA2-7B Adaptive PAIR attack | Attack Success Rate (ASR): 0 | 6 |
| LLM Jailbreaking | Llama2-DA | SRF: 57 | 4 |
| Relative Pos. Attention | Llama2-7b (q=32, k=32) (1k) | TFLOPS (Relative Pos. Attention): 114.85 | 4 |
| Share Question Mask Attention | Llama2-7b (q=32, k=32) (1k) | TFLOPS (Share QK Mask Attention, 1k): 39.81 | 4 |
| Global Sliding Window Attention | Llama2-7b q=32, k=32 (1k) | TFLOPS: 67.36 | 4 |
| PrefixLM Attention | Llama2-7b (q=32, k=32) (8k) | TFLOPS (PrefixLM Attention): 163.7 | 4 |
| LLM Inference | LLaMA2-7B 1,024 tokens | Latency (ms): 20 | 4 |
| LLM Training | Llama2-70B (64 x H100-8) | Iteration Time (s): 7.8 | 4 |

Showing 25 of 40 rows