Spotting on Loong Set 2: 50K–100K Tokens
[Chart: LLM Score; top entry Disco-RAG, 69.92 (Jan 7, 2026). Available metrics: LLM Score, Exact Match.]
Updated 4d ago
Evaluation Results
| Method | Setting | Date | LLM Score | Exact Match |
|---|---|---|---|---|
| Disco-RAG | Base Model=Llama-3.3-70B | 2026.01 | 69.92 | 39 |
| StructRAG | Condition=SOTA Results | 2026.01 | 68 | 41 |
| Disco-RAG | Base Model=Qwen2.5-72B | 2026.01 | 67.17 | 36 |
| Disco-RAG | Base Model=Llama-3.1-8B | 2026.01 | 66.03 | 36 |
| Llama-3.3-70B | Condition=Standard RAG | 2026.01 | 60.38 | 27 |
| Qwen2.5-72B | Condition=Standard RAG | 2026.01 | 60.13 | 26 |
| RQ-RAG | Condition=SOTA Results | 2026.01 | 57.35 | 35 |
| Llama-3.1-8B | Condition=Standard RAG | 2026.01 | 57.02 | 25 |
| Llama-3.3-70B | Condition=Full Context | 2026.01 | 55.27 | 34 |
| Qwen2.5-72B | Condition=Full Context | 2026.01 | 52.37 | 30 |
| Llama-3.1-8B | Condition=Full Context | 2026.01 | 51.3 | 27 |
| GraphRAG | Condition=SOTA Results | 2026.01 | 24.8 | 0 |