Long-context Reasoning on LongBench 2
[Chart: P@4 by method. Top result: MPD, 50.3 P@4, May 9, 2026. Updated 22d ago.]
Evaluation Results
Method            Date       P@4     Token Usage
MPD               2026.05    50.3    700
CRISP             2026.05    49.9    1,000
Vanilla LLM       2026.05    48.7    900
Direct Comp.      2026.05    48.1    800
Chain-of-Draft    2026.05    40.8    1,800
LiteCoT           2026.05    36.6    2,000