Our new X account is live! Follow @wizwand_team for updates
SOTA Inference Efficiency benchmarks and papers with code | Wizwand
Inference Efficiency
Benchmarks
| Dataset Name | SOTA Method | Metric | Value | Results | Last Updated |
| --- | --- | --- | --- | --- | --- |
| HumanEval | FLy | Speedup Factor | 5.15 | 54 | 3d ago |
| DeepScaleR-40k (1,024 mathematical problems) | G-KV | Throughput (tokens/s) | 760.74 | 26 | 3d ago |
| Samsung Galaxy S25 Qualcomm Snapdragon 8 Elite SoC inference v1.0 | LFM2-350M | Prefill Throughput (1K) (tokens/s) | 1,067 | 20 | 3d ago |
| ImageNet-1k | Gaussian | Inference Length | -4.27 | 20 | 3d ago |
| MS-COCO | Gaussian | Sequence Length Delta | -18.33 | 20 | 3d ago |
| Model Profiling | ESPACE | Total GEMM Latency (ms) | 15.9 | 19 | 3d ago |
| Synthetic Lego scene (test) | D-NeRF | Storage (MB) | 4 | 15 | 3d ago |
| HotpotQA | SeleCom | Time to Last Token (ms) | 496 | 12 | 3d ago |
| Natural Questions (NQ) | COCOM | TIL (ms) | 488 | 12 | 3d ago |
| Inference Efficiency Evaluation | CS-LSTMs | Inference Latency (s) | 0.0046 | 12 | 3d ago |
| openPangu Embedded Efficiency Benchmark | openPangu-Embedded | Prefill Latency (ms) | 528 | 10 | 3d ago |
| HAGRID | SAM-Decoding[E2] | #MAT | 4.75 | 9 | 3d ago |
| LLaVA 7B 1.5 | TrimTokenator-LC | Latency (ms) | 802.65 | 8 | 3d ago |
| KV Cache Efficiency | L2-7B (Base) | SKV Count | 1,966 | 7 | 3d ago |
| LLaVA-NeXT Inference | Vanilla | Inference Time (s) | 7.998 | 6 | 3d ago |
| Custom Efficiency Setup 640x480 resolution | AuroraEdge-V-2B | GFLOPS | 263.8 | 4 | 3d ago |
| 90k Context Length Llama-3.1-8B | Finetuning | Throughput (queries/s) | 8.9 | 4 | 3d ago |
| 30k Context Length (Llama-3.1-8B) | Finetuning | Inference Throughput (QPS) | 15.8 | 4 | 3d ago |
| 30k Context Length Llama-2-7B | Finetuning | Inference Throughput (QPS) | 6.6 | 4 | 3d ago |
| POPE (test) | VisionZip | Total Time (ms) | 756,000 | 4 | 2d ago |
Showing 20 of 20 rows