
Llama 2

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Watermark Detection | Llama-2-7b-chat-hf 10 samples UMD watermarking (test) | AUROC (t=0): 1 | 64 |
| Attention Operator Latency | LLaMA-2 Chat 7B | Attention Latency (ms): 0.075 | 60 |
| Safety Evaluation | LLaMA-2-7B-CHAT Safety (test) | Safety Score: 0.55 | 60 |
| Language Modeling | Llama-2 13B | Perplexity (PPL): 4.85 | 32 |
| Jailbreak Attack Transferability | Llama-2-7b-chat finetuned variants v1 (test) | Transfer Success Rate (TSR): 60.4 | 16 |
| Watermark Attack Success Rate | Llama-2-7b-chat-hf UMD watermarking (10 samples) | ASR: 100 | 15 |
| LLM Quantization | Llama-2-70B | GPU Hours (h): 2.2 | 13 |
| LLM Inference Verification | Llama-2 7B | Verification Latency (s): 0.17 | 12 |
| Training Stability Analysis | Llama-2 7B pre-training | Number of Spikes: 0 | 9 |
| Hybrid-Dimension Reconfiguration | LLaMA-2 32B | Reconfiguration Time (s): 0.253 | 8 |
| Knowledge Distillation Robustness | Llama-2-7B teacher vs. llama-2-7b-logit-watermark-distill-kgw-k1-gamma0.25-delta2 student (test) | Similarity Score: 99.98 | 7 |
| Model Fingerprinting | Llama-2 DPO 7B | Similarity Score: 99.94 | 7 |
| Attribute Steering | Llama-2-7b-Chat-hf Open-Ended Generation | Wealth Score: 2.46 | 7 |
| Ownership Verification | Llama-2-7B SFT & RLHF | FSR (Anchor): 2 | 6 |
| Ownership Verification | Llama-2-7B Taylor Pruning 5% sparsity | FSR: 0 | 6 |
| Ownership Verification | Llama-2-7B Random Pruning, 10% sparsity | FSR: 0 | 6 |
| Ownership Verification | Llama-2-7B Random Pruning, 5% sparsity | False Success Rate (FSR): 0 | 6 |
| Decoding Latency | Llama-2-7B 32k sequence length v1 (inference) | Decoding Latency (s): 0.062 | 6 |
| Decoding Latency | Llama-2-7B 16k sequence length v1 (inference) | Decoding Latency (s): 0.041 | 6 |
| Decoding Throughput | Llama 2 7B inference v1.0 | Decoding Throughput (TOK/s): 188 | 6 |
| Language Modeling | LLaMA-2 7B pre-training (val) | Validation Perplexity (40K steps): 16.01 | 5 |
| Model Fingerprinting | LLaMA-2 7B fine-tuned variants | U-test p-value: 0 | 5 |
| LLM Inference | LLaMA-2 70B sequence length 2048 | Max Batch Size: 384 | 5 |
| Decoding Latency | Llama-2-7B 64k sequence length v1 (inference) | Decoding Latency (s): 0.098 | 5 |
| Decoding Throughput | Llama 2 70B v1.0 (inference) | Throughput (TOK/s): 23.5 | 5 |

Showing 25 of 54 rows