Long-Context

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Many-shot in-context learning | Long-context benchmarks | ICL Performance (8k Context): 74.2 | 21 |
| Context Management | Long-context (test) | mTokens: 1 | 19 |
| Average across tasks | Long-context benchmarks | Performance (8k Context): 45.9 | 8 |
| Long-Context Training | Long-Context (train) | Metric: - | 0 |