LongReason

Benchmarks

Task Name               Dataset Name                      SOTA Result            Trend
Long-context reasoning  LongReason 64K-input 70K context  Accuracy: 71.25        34
Long-context reasoning  LongReason                        Score: 86.9            18
Multi-choice reasoning  LongReason                        Accuracy (32k): 84.13  17
Question Answering      LongReason                        Acc: 72.3              15
RL Training             LongReason                        Peak Memory (GB): 80   6
Reasoning               LongReason (val)                  Accuracy (val): 79.3   4