LLM Inference on Qwen2.5-1.5B-Instruct

Throughput: 155.3 tok/s

CUDA (eager, RTX 5090)

[Chart; date label: Feb 9, 2026]
Updated 13d ago

Evaluation Results

Method | Links
2026.02    155.3    154.9    0.6    -1
2026.02    20.6    20.4    2.9    -0.13
2026.02    17.9    17.7    3.8    51.3    0.12
2026.02    10.4    10.4    0.9    87.9    0.07