Spotting on Loong Set 4: 200K–250K Tokens
[Chart: LLM Score by date. Best result: Disco-RAG, 57.74 (Jan 7, 2026). Available metrics: LLM Score, Exact Match.]
Updated 4d ago
Evaluation Results
| Method | Setting | Date | LLM Score | Exact Match |
|---|---|---|---|---|
| Disco-RAG | Base Model=Llama-3.3-70B | 2026.01 | 57.74 | 27 |
| Disco-RAG | Base Model=Qwen2.5-72B | 2026.01 | 57.27 | 22 |
| StructRAG | Condition=SOTA Results | 2026.01 | 56.87 | 19 |
| Disco-RAG | Base Model=Llama-3.1-8B | 2026.01 | 56.68 | 19 |
| Llama-3.3-70B | Condition=Standard RAG | 2026.01 | 40.27 | 25 |
| Qwen2.5-72B | Condition=Standard RAG | 2026.01 | 40.14 | 16 |
| Llama-3.1-8B | Condition=Standard RAG | 2026.01 | 40.01 | 11 |
| Llama-3.3-70B | Condition=Full Context | 2026.01 | 36.76 | 21 |
| Qwen2.5-72B | Condition=Full Context | 2026.01 | 34.22 | 18 |
| Llama-3.1-8B | Condition=Full Context | 2026.01 | 31.79 | 12 |
| RQ-RAG | Condition=SOTA Results | 2026.01 | 29.17 | 8 |
| GraphRAG | Condition=SOTA Results | 2026.01 | 17.5 | 0 |