
InfiniteBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Long-context language understanding | InfiniteBench | En.Sum: 32.93 | 63 |
| Long-context understanding | InfiniteBench v1 (test) | Dialogue: 20 | 31 |
| Long-context understanding | InfiniteBench | En. MC Accuracy: 0.6812 | 12 |
| Key-Value Retrieval | InfiniteBench 8k | Accuracy: 96 | 12 |
| Key-Value Retrieval | InfiniteBench 4k | Accuracy: 100 | 12 |
| Long-context language modeling | InfiniteBench (test) | En Sum Score: 1 | 10 |
| Key-Value Retrieval | InfiniteBench 16k | Accuracy (%): 87 | 10 |
| Long-context reasoning | InfiniteBench (test) | Reasoning Pa Score: 87.63 | 6 |
| Long-context understanding | InfiniteBench (test) | En QA F1: 36.7 | 6 |
| Long context understanding | InfiniteBench En.MC | Accuracy: 83.4 | 5 |
| Long-context language understanding | InfiniteBench | InfiniteBench QA (EN) Score: 7.84 | 4 |
| Math Find | InfiniteBench | Performance (8k Context): 37.14 | 3 |
| KV | InfiniteBench | KV Retrieval Score (8k): 6.2 | 3 |