
Inference Efficiency on LLaMA2-7B (12/128 tokens)

Latency: 1.889 (SWM (Ours))

Feb 26, 2025
Updated 29d ago

Evaluation Results

Date      Latency   (unlabeled)   (unlabeled)
2025.02   1.889     67.77         8,725.9
2025.02   1.938     66.048        8,725.9
2025.02   2.084     61.433        8,725.85
2025.02   2.339     54.758        10,682.45
2025.02   2.529     50.622        10,682.45
2025.02   2.585     49.542        10,682.45
2025.02   2.729     46.905        13,020.25
2025.02   4.045     31.656        10,707.25
2025.02   4.127     31.051        8,855.95
2025.02   4.619     27.726        8,901
2025.02   4.628     27.663        10,676
2025.02   5.63      22.736        9,043.9
2025.02   5.655     22.635        10,951.5