LLM Inference on LLaMA-2 70B (sequence length 2048)
[Chart: Max Batch Size by date. Highlighted result: CXL-SpecKV + Comp, Max Batch Size 384, Dec 11, 2025. Selectable metrics: Max Batch Size, Available Memory (GB), Expansion Factor.]
Updated 4d ago
Evaluation Results
Method            | Date    | Max Batch Size | Available Memory (GB) | Expansion Factor
CXL-SpecKV + Comp | 2025.12 | 384            | -                     | 24
CPU Offload       | 2025.12 | 192            | -                     | 12
CXL-SpecKV        | 2025.12 | 128            | -                     | 8
GPU + Compression | 2025.12 | 48             | -                     | 3
GPU-Only          | 2025.12 | 16             | -                     | 1
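The page does not define Expansion Factor. The reported values are consistent with one reading: each method's Max Batch Size divided by the GPU-Only Max Batch Size of 16. The sketch below checks that reading against the listed numbers. The choice of GPU-Only as the baseline and the helper name `expansion_factors` are assumptions made for illustration, not something the source states.

```python
# Max Batch Size and Expansion Factor as listed in the results above.
MAX_BATCH = {
    "CXL-SpecKV + Comp": 384,
    "CPU Offload": 192,
    "CXL-SpecKV": 128,
    "GPU + Compression": 48,
    "GPU-Only": 16,
}
REPORTED_FACTOR = {
    "CXL-SpecKV + Comp": 24,
    "CPU Offload": 12,
    "CXL-SpecKV": 8,
    "GPU + Compression": 3,
    "GPU-Only": 1,
}


def expansion_factors(max_batch, baseline="GPU-Only"):
    """Return each method's batch size relative to the baseline method.

    Assumption: Expansion Factor = Max Batch Size / baseline Max Batch Size.
    The source does not define the metric; this is one consistent reading.
    """
    base = max_batch[baseline]
    return {method: size / base for method, size in max_batch.items()}


computed = expansion_factors(MAX_BATCH)

if __name__ == "__main__":
    for method, factor in computed.items():
        agrees = factor == REPORTED_FACTOR[method]
        print(f"{method}: computed {factor:g}, reported {REPORTED_FACTOR[method]}, match={agrees}")
```

Under this reading all five rows agree with the reported column, so the Expansion Factor adds no information beyond Max Batch Size and the baseline. The Available Memory (GB) column is empty for every method, so it cannot be cross-checked the same way.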