Clarification Generation on DeepResearch Bench offline (test)
Best Quality Score: 2.43 (IntentRL), Feb 3, 2026
[Chart: Quality Score over time]
Evaluation Results
Method         Details                     Date      Quality Score
IntentRL       -                           2026.02   2.43
Learn-to-Ask   baseline_type=proactiv...   2026.02   2.28
CollabLLM      baseline_type=proactiv...   2026.02   2.15
Tell Me More   baseline_type=proactiv...   2026.02   1.44