Multi-hop Reasoning on CWQ
[Chart: Hits@1 over time. Best result: LMP, 82.2 Hits@1, Apr 14, 2026.]
Updated 3d ago
Evaluation Results
Method       LLM          Size  Date     Hits@1
LMP          GPT-4        -     2026.04  82.2
KG-Reasoner  Qwen3        30B   2026.04  78.14
PoG          GPT-4        -     2026.04  75
iQUEST       GPT-4o       -     2026.04  73.8
ORT          DeepSeek-v3  ...   2026.04  72.9
KG-Agent     LLaMA3       7B    2026.04  72.2
KBQA-o1      LLaMA3       70B   2026.04  72
ToG          GPT-4        -     2026.04  67.6
RoG          LLaMA2       7B    2026.04  62.6
LightPROF    LLaMA3       8B    2026.04  59.3