LLM Inference Performance on Context Length 10K
[Chart: Prefill Time (s) and Decode Throughput (tok/s, per request / full KV cache) across methods; updated Mar 12, 2026]
Evaluation Results

| Method | Links | Prefill Time (s) | Decode Throughput (tok/s, per request) | Decode Throughput (tok/s, full KV cache) |
|---|---|---|---|---|
| IndexCache (Retention Ratio=1/4) | 2026.03 | 0.45 | 91 | 3,310 |
| IndexCache (Retention Ratio=1/2) | 2026.03 | 0.47 | 84.5 | 3,070 |
| DSA (Retention Ratio=Full) | 2026.03 | 0.57 | 73.5 | 2,700 |
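A quick sketch of the relative gains implied by the table: comparing the most aggressive IndexCache setting (Retention Ratio=1/4) against the DSA full-cache baseline. The dictionary keys below are illustrative labels, not names from the source.

```python
# Numbers taken directly from the Evaluation Results table above.
baseline = {"prefill_s": 0.57, "decode_per_req": 73.5, "decode_full_kv": 2700}
indexcache = {"prefill_s": 0.45, "decode_per_req": 91.0, "decode_full_kv": 3310}

# Lower prefill time is better, so the ratio is baseline / method.
prefill_speedup = baseline["prefill_s"] / indexcache["prefill_s"]
# Higher throughput is better, so the ratio is method / baseline.
decode_speedup = indexcache["decode_per_req"] / baseline["decode_per_req"]
batch_speedup = indexcache["decode_full_kv"] / baseline["decode_full_kv"]

print(f"prefill {prefill_speedup:.2f}x, per-request decode {decode_speedup:.2f}x, "
      f"full-KV decode {batch_speedup:.2f}x")
```

Under these numbers, IndexCache at a 1/4 retention ratio comes out roughly 1.2-1.3x ahead of the DSA baseline on all three metrics at the 10K context length.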