Inference Efficiency on Alpaca
[Chart: Latency (ms/tok). Baseline: 102.44 on Aug 14, 2025. Other available metrics: Throughput (tok/s), VRAM (GB).]
Evaluation Results
Method    Details                    Date     Latency (ms/tok)  Throughput (tok/s)  VRAM (GB)
Baseline  Backbone=Llama3-8B-Ins...  2025.08  102.44            9.76                16.1
CAA       Backbone=Llama3-8B-Ins...  2025.08  110.59            9.04                16.12
ReFT      Backbone=Llama3-8B-Ins...  2025.08  115.72            8.64                16.26
MSRS      Backbone=Llama3-8B-Ins...  2025.08  120.35            8.38                16.29
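To make the cost of each method easier to compare, the reported numbers can be expressed as a percentage change relative to the Baseline row. The sketch below is only an illustration: the values are copied from the results listed above, while the `RESULTS` dictionary, its key names, and the `relative_change` helper are hypothetical names introduced here and are not part of the benchmark.

```python
# Values copied from the Evaluation Results listing (Llama3-8B backbone, 2025.08).
# The dictionary layout and key names are assumptions made for this sketch.
RESULTS = {
    "Baseline": {"latency_ms_per_tok": 102.44, "throughput_tok_per_s": 9.76, "vram_gb": 16.1},
    "CAA": {"latency_ms_per_tok": 110.59, "throughput_tok_per_s": 9.04, "vram_gb": 16.12},
    "ReFT": {"latency_ms_per_tok": 115.72, "throughput_tok_per_s": 8.64, "vram_gb": 16.26},
    "MSRS": {"latency_ms_per_tok": 120.35, "throughput_tok_per_s": 8.38, "vram_gb": 16.29},
}


def relative_change(results, metric, baseline="Baseline"):
    """Return the percentage change of `metric` versus the baseline row.

    Positive values mean the method's number is higher than the baseline's.
    """
    base = results[baseline][metric]
    return {
        name: (row[metric] - base) / base * 100.0
        for name, row in results.items()
        if name != baseline
    }


if __name__ == "__main__":
    for metric in ("latency_ms_per_tok", "throughput_tok_per_s", "vram_gb"):
        print(metric)
        for name, pct in relative_change(RESULTS, metric).items():
            print(f"  {name}: {pct:+.2f}%")
```

Read this way, higher latency and VRAM percentages indicate added cost, while a negative throughput percentage indicates slower generation than the Baseline.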