End-to-End Question Answering on MCP-Bench
[Chart: Accuracy (Human) by date. Plotted point: TURA, 87.5, Aug 6, 2025. Selectable metrics: Accuracy (Human), Accuracy (LLM), Faithfulness (Human), Faithfulness (LLM).]
Updated 1mo ago
Evaluation Results
Method        Date      Accuracy (Human)  Accuracy (LLM)  Faithfulness (Human)  Faithfulness (LLM)
TURA          2025.08   87.5              88.3            96.2                  97.1
Tool-Agent    2025.08   76.8              80.4            81.7                  83.9
Dynamic RAG   2025.08   67.2              69.5            77.6                  79.4
LLM + RAG     2025.08   65.3              68.1            72.4                  75.0