Share your thoughts, 1 month free Claude Pro on usSee more

Llama2

Benchmarks

Task Name	Dataset Name	SOTA Result
Attention Operator Throughput	Llama2 7B (32 Q-heads/32 KV-heads/128 Head-dimension)	Attention TFLOPS207.3	42
LLM Inference	LLaMA2 7B	TTFT (ms)11.09	33
Distributed Training	LLaMA2 13B	Number of Recomputed Layers0	21
Jailbreak attack	Llama2-7b five finetuned variants	Average ASR0	16
Accuracy	LLaMA2-7B zero-shot	Zero-Shot Accuracy67.18	16
Targeted Refusal	Llama2-7B Generation Evaluation Set	Completion Accuracy (CA)93.14	15
Sentiment Steering	Llama2-7B Generation Evaluation Set	Accuracy (CA)90.15	15
Multi-bit Watermarking	LLaMA2-7B 300 tokens (test)	Perplexity7.0486	14
Inference Efficiency	LLaMA2-7B 12/128 tokens	Latency1.889	13
Jailbreak Attack	llama2-7b v1 (pretrained)	ASR0	13
Watermark Detection	Llama2-7B Copy-paste attack	F1 Score97.8	11
LLM fingerprinting	Llama2 7B	AUC100	10
Watermark Detection	Llama2-7B Paragraphing	F1 Score91.6	8
Watermark Detection	Llama2-7B Synonymous substitution	F1 Score98.5	8
Watermark Detection	Llama2-7B Clean	F1 Score100	8
Efficiency Analysis	LLaMA2-7B	Memory Usage (GB)1.34	7
Jailbreak Defense	LLaMA2-7B Adaptive AutoDAN-T attack	ASR17	6
Jailbreak Defense	LLaMA2-7B Adaptive PAIR attack	Attack Success Rate (ASR)0	6
LLM Jailbreaking	Llama2-DA	SRF57	4
Relative Pos. Attention	Llama2-7b (q=32, k=32) (1k)	TFLOPS (Relative Pos. Attention)114.85	4
Share Question Mask Attention	Llama2-7b (q=32, k=32) (1k)	TFLOPS (Share QK Mask Attention, 1k)39.81	4
Global Sliding Window Attention	Llama2-7b q=32, k=32 (1k)	TFLOPS67.36	4
PrefixLM Attention	Llama2-7b (q=32, k=32) (8k)	TFLOPS (PrefixLM Attention)163.7	4
LLM Inference	LLaMA2-7B 1,024 tokens	Latency (ms)20	4
LLM Training	Llama2-70B (64 x H100-8)	Iteration Time (s)7.8	4

Showing 25 of 40 rows