Llama 2

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
| --- | --- | --- | --- | --- |
| Watermark Detection | Llama-2-7b-chat-hf 10 samples UMD watermarking (test) | AUROC (t=0) | 1 | 64 |
| Attention Operator Latency | LLaMA-2 Chat 7B | Attention Latency (ms) | 0.075 | 60 |
| Safety Evaluation | LLaMA-2-7B-CHAT Safety (test) | Safety Score | 0.55 | 60 |
| Jailbreak Attack Transferability | Llama-2-7b-chat finetuned variants v1 (test) | Transfer Success Rate (TSR) | 60.4 | 16 |
| Watermark Attack Success Rate | Llama-2-7b-chat-hf UMD watermarking (10 samples) | ASR | 100 | 15 |
| LLM Quantization | Llama-2-70B | GPU Hours (h) | 2.2 | 13 |
| LLM Inference Verification | Llama-2 7B | Verification Latency (s) | 0.17 | 12 |
| Training Stability Analysis | Llama-2 7B pre-training | Number of Spikes | 0 | 9 |
| Attribute Steering | Llama-2-7b-Chat-hf Open-Ended Generation | Wealth Score | 2.46 | 7 |
| Decoding Latency | Llama-2-7B 32k sequence length v1 (inference) | Decoding Latency (s) | 0.062 | 6 |
| Decoding Latency | Llama-2-7B 16k sequence length v1 (inference) | Decoding Latency (s) | 0.041 | 6 |
| Decoding Throughput | Llama 2 7B inference v1.0 | Decoding Throughput (TOK/s) | 188 | 6 |
| Model Fingerprinting | LLaMA-2 7B fine-tuned variants | U-test p-value | 0 | 5 |
| LLM Inference | LLaMA-2 70B sequence length 2048 | Max Batch Size | 384 | 5 |
| Decoding Latency | Llama-2-7B 64k sequence length v1 (inference) | Decoding Latency (s) | 0.098 | 5 |
| Decoding Throughput | Llama 2 70B v1.0 (inference) | Throughput (TOK/s) | 23.5 | 5 |
| Training Stability Analysis | Llama-2 1.3B (train) | Number of Spikes | 0 | 4 |
| Computational Efficiency | LLaMA-2-7B (test) | Total Time | 1,495 | 4 |
| LLM Serving | LLaMA-2 70B chatbot workload | TTFT (ms) | 45.2 | 4 |
| LLM Decoding | Llama-2-70B | Per-step Decoding Latency | 0.2163 | 4 |
| Machine Translation | Llama-2-13B-chat Seen Languages (tl2en) | BLEU | 32.6 | 3 |
| Machine Translation | Llama-2-13B-chat Seen Languages (en2tl) | BLEU | 18.3 | 3 |
| Machine Translation | Llama-2-13B-chat Seen Languages (sq2en) | BLEU | 17.5 | 3 |
| Machine Translation | Llama-2-13B-chat Seen Languages (en2sq) | BLEU | 9.7 | 3 |
| Machine Translation | Llama-2-13B-chat Seen Languages sk2en | BLEU | 35.4 | 3 |

Showing 25 of 43 rows