
LLM Inference on Qwen 0.5B Instruct 2.5

Throughput (Tok/s): 185.5

CUDA (compiled, RTX 5090)

[Chart: axis ticks 6.204, 52.752, 99.3, 145.848; Feb 9, 2026]
Updated 13d ago

Evaluation Results

Method   Links
2026.02  185.5  184.2  0.9   5.4  1
2026.02  182.9  182.3  0.4   5.5  0.99
2026.02   47.8   47.7  0.9  20.9  0.26
2026.02   21     20.7  4    41.6  0.11
2026.02   13.7   13.4  3.2  72.8  0.07
2026.02   13.1   13    1.1  73.5  0.07
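
The numeric columns above are not labeled in the source. A minimal sketch, assuming the first number in each row is throughput in Tok/s and the last number is that throughput divided by the top entry's, rounded to two decimals. The function name `relative_to_best` is hypothetical and only illustrates the arithmetic.

```python
# Throughput values (Tok/s), read from the first number of each results row.
throughputs = [185.5, 182.9, 47.8, 21.0, 13.7, 13.1]

# Last number of each row; ASSUMED to be throughput relative to the top entry.
listed_ratios = [1.0, 0.99, 0.26, 0.11, 0.07, 0.07]


def relative_to_best(values):
    """Return each value divided by the largest value, rounded to 2 decimals."""
    best = max(values)
    return [round(v / best, 2) for v in values]


computed = relative_to_best(throughputs)

for tput, ratio in zip(throughputs, computed):
    print(f"{tput:6.1f} Tok/s  {ratio:.2f} of best")
```

Under that assumption the computed ratios agree with the listed ones for all six rows, which supports reading the last column as a speed ratio against the 185.5 Tok/s entry.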