Inference Latency on WikiText-103, SQuAD v2, and OpenBookQA
[Chart: Inference Latency (ms) over time. Best reported result: 57.9 ms (LLMCache), as of Dec 18, 2025.]
Evaluation Results
| Method   | Model       | Date    | Inference Latency (ms) |
|----------|-------------|---------|------------------------|
| LLMCache | DistilBERT  | 2025.12 | 57.9  |
| LLMCache | BERT-base   | 2025.12 | 91.3  |
| LLMCache | GPT-2 small | 2025.12 | 112.5 |
| NoCache  | DistilBERT  | 2025.12 | 123.4 |
| KV-Cache | GPT-2 small | 2025.12 | 177.3 |
| NoCache  | BERT-base   | 2025.12 | 218.6 |
| NoCache  | GPT-2 small | 2025.12 | 304.8 |
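
The leaderboard does not publish its measurement harness, so the numbers above should be read as wall-clock forward-pass latencies under some fixed protocol. Below is a minimal sketch of how such numbers are typically collected, assuming the Hugging Face transformers library and the `distilbert-base-uncased` checkpoint (matching one table row); the warmup count, run count, and median aggregation are illustrative assumptions, not this benchmark's actual protocol.

```python
import statistics
import time

import torch
from transformers import AutoModel, AutoTokenizer

# Assumption: checkpoint name and harness details are illustrative,
# not the leaderboard's published setup.
MODEL_NAME = "distilbert-base-uncased"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME).eval()


def measure_latency_ms(text: str, warmup: int = 3, runs: int = 20) -> float:
    """Median wall-clock latency of a single forward pass, in milliseconds."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        for _ in range(warmup):  # warm up kernels and allocator
            model(**inputs)
        timings = []
        for _ in range(runs):
            start = time.perf_counter()
            model(**inputs)
            timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


if __name__ == "__main__":
    latency = measure_latency_ms("The quick brown fox jumps over the lazy dog.")
    print(f"{latency:.1f} ms")
```

Median (rather than mean) aggregation is a common choice here because one-off scheduler or allocator hiccups would otherwise skew small timing samples.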
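
The NoCache and KV-Cache rows for GPT-2 small correspond to autoregressive decoding without, versus with, reuse of attention key/value states. A minimal sketch of greedy decoding with transformers' KV cache, assuming the `gpt2` checkpoint; this shows the general technique, not this benchmark's implementation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

prompt_ids = tokenizer("The capital of France is", return_tensors="pt").input_ids

with torch.no_grad():
    # Prefill: run the full prompt once and keep the attention KV cache.
    out = model(prompt_ids, use_cache=True)
    past = out.past_key_values
    next_id = out.logits[:, -1].argmax(dim=-1, keepdim=True)

    # Decode: each step feeds only the newest token plus the cached
    # keys/values, avoiding recomputation over the whole prefix.
    for _ in range(5):
        out = model(next_id, past_key_values=past, use_cache=True)
        past = out.past_key_values
        next_id = out.logits[:, -1].argmax(dim=-1, keepdim=True)
        print(tokenizer.decode(next_id[0]), end="")
    print()
```

Without the cache (the NoCache rows), each decode step would re-run attention over the entire prefix, which is consistent with the larger latencies reported above.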