Long-context retrieval and reasoning on Loong full (evaluation)
Top Average Score: 68, Baseline (Full Context), Mar 9, 2026
[Chart: Average Score by date; metrics available: Average Score, Precision (PR)]
Updated 1mo ago
Evaluation Results

Method                    Details                     Date      Average Score   Precision (PR)
Baseline (Full Context)   Avg Input Tok=253,085,...   2026.03   68              31.4
SPD-RAG                   Avg Input Tok=193,954,...   2026.03   58.1            18.6
Normal RAG                Avg Input Tok=22,174,...    2026.03   33              13.7
Agentic RAG               Avg Input Tok=81,877,...    2026.03   32.8            8.8