| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| Mobile and IoT Device Latency Benchmark llama.cpp (inference evaluation) | | Throughput (Sample, tokens/s) | 4,215 | 16 | 1mo ago |
| Llama-2 7B-Chat | ARC engine | Latency (ms/token) | 76 | 4 | 22d ago |
| Sequence Bucket Ultra-long | COREY | Latency (ms) | 77.97 | 3 | 5d ago |
| Sequence Bucket Long | COREY | Latency (ms) | 69.58 | 3 | 5d ago |
| Sequence Bucket Medium | COREY | Latency (ms) | 52.88 | 3 | 5d ago |
| Sequence Bucket Short | COREY | Latency (ms) | 39.26 | 3 | 5d ago |
| stories100m 110M parameters | | Tokens/s | 298.7 | 3 | 1mo ago |