Language Modeling on WikiText-2 (vLLM harness, test split)
[Chart: Perplexity (PPL) over time. Llama 3.1-8B-Instruct: Baseline (FP16) at 8.87 PPL as of Mar 18, 2026. Updated 1mo ago.]
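Every entry below reports perplexity on the WikiText-2 test set: the exponential of the average per-token negative log-likelihood, so lower is better. As a reference for how such a number is produced, here is a minimal sketch using the Hugging Face transformers and datasets stack; this is an illustrative assumption, since the leaderboard's actual vLLM harness and its context-length and stride settings are not shown on this page.

```python
# Minimal sketch of WikiText-2 perplexity: PPL = exp(mean NLL).
# Assumptions: HF transformers/datasets instead of the actual vLLM
# harness; the window length below is illustrative, not the harness value.
import math

import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # assumed checkpoint id
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
).eval()

text = "\n\n".join(load_dataset("wikitext", "wikitext-2-raw-v1", split="test")["text"])
ids = tok(text, return_tensors="pt").input_ids

window, nll_sum, n_tokens = 4096, 0.0, 0  # non-overlapping windows
for begin in range(0, ids.size(1), window):
    chunk = ids[:, begin : begin + window].to(model.device)
    if chunk.size(1) < 2:  # nothing to score in a 1-token remainder
        break
    with torch.no_grad():
        # With labels == input_ids, the model returns the mean NLL
        # over the (len - 1) next-token predictions in the chunk.
        loss = model(chunk, labels=chunk).loss
    nll_sum += loss.item() * (chunk.size(1) - 1)
    n_tokens += chunk.size(1) - 1

print(f"PPL = {math.exp(nll_sum / n_tokens):.2f}")
```

Window length, stride, and tokenization all shift the absolute score, so PPL values are only comparable within a single harness such as this one.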
Evaluation Results

| Method | Details | Date | Perplexity (PPL) |
| --- | --- | --- | --- |
| Llama 3.1-8B-Instruct: Baseline (FP16) | Method=Full precision,... | 2026.03 | 8.87 |
| GGUF Q5_K_S | Method=Uniform quant.,... | 2026.03 | 8.99 |
| FP8 | Method=Uniform quant.,... | 2026.03 | 9.04 |
| GPTQ INT4 | Method=Uniform quant.,... | 2026.03 | 9.30 |
| Clustered (Orig), no training | Method=K-means, Disk (... | 2026.03 | 9.32 |
| AWQ INT4 | Method=Uniform quant.,... | 2026.03 | 9.35 |
| AQLM 2-bit | Method=Codebook, Disk... | 2026.03 | 11.77 |
| Compressed 3B-Llama | Method=CompactifAI, Di... | 2026.03 | 12.62 |
| Compressed + Clustered + Fine-tuned | Method=CompactifAI + K... | 2026.03 | 13.05 |
| Compressed + Clustered | Method=CompactifAI + K... | 2026.03 | 13.36 |
| Compressed + Clustered + AWQ | Method=CompactifAI + K... | 2026.03 | 13.86 |
| Compressed + Clustered + GPTQ | Method=CompactifAI + K... | 2026.03 | 14.21 |
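The table spans two compression families. The uniform rows (GGUF Q5_K_S, FP8, GPTQ INT4, AWQ INT4) round weights to an evenly spaced grid, while the clustered and codebook rows (K-means, AQLM) replace each weight with the nearest of a small set of learned centroids. The sketch below contrasts the two ideas on a single weight matrix; the helper names are hypothetical, and it deliberately omits the calibration and error-compensation steps that make real GPTQ, AWQ, and AQLM pipelines competitive.

```python
# Illustrative contrast between uniform round-to-nearest quantization
# and K-means codebook ("clustered") quantization of a weight matrix.
# Hypothetical helpers for intuition only; not the GPTQ/AWQ/AQLM
# algorithms, which add calibration data and error correction.
import numpy as np

def uniform_quantize(w: np.ndarray, bits: int = 4) -> np.ndarray:
    """Symmetric per-tensor round-to-nearest on a 2**bits-level grid."""
    qmax = 2 ** (bits - 1) - 1                     # 7 for INT4
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale                               # dequantized weights

def kmeans_quantize(w: np.ndarray, k: int = 16, iters: int = 20) -> np.ndarray:
    """Replace each weight with the nearest of k learned centroids."""
    flat = w.reshape(-1)
    centroids = np.quantile(flat, np.linspace(0, 1, k))  # quantile init
    for _ in range(iters):
        assign = np.abs(flat[:, None] - centroids[None, :]).argmin(axis=1)
        for j in range(k):                          # recenter each cluster
            members = flat[assign == j]
            if members.size:
                centroids[j] = members.mean()
    return centroids[assign].reshape(w.shape)

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(256, 256)).astype(np.float32)
for name, approx in [("uniform INT4", uniform_quantize(w)),
                     ("k-means, k=16", kmeans_quantize(w))]:
    rel_err = np.linalg.norm(w - approx) / np.linalg.norm(w)
    print(f"{name}: relative weight error {rel_err:.4f}")
```

At a matched bit budget, K-means centroids adapt to the weight distribution where a uniform grid cannot, which is one reason the training-free clustered entry lands near the calibrated INT4 methods above. The production methods go further still: GPTQ compensates rounding error using second-order statistics from calibration data, and AWQ rescales salient channels based on activation magnitudes.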