Task-oriented Dialogue on Frames
[Chart: Success Rate (SR) over time; best result 50.57 by VLK-RL, Apr 25, 2026. Metrics: Success Rate (SR), Hit Rate (HR).]
Updated 1mo ago
Evaluation Results
| Method | Configuration | Date | Success Rate (SR) | Hit Rate (HR) |
|---|---|---|---|---|
| VLK-RL | backbone=Qwen-14B | 2026.04 | 50.57 | 3.32 |
| VLK-RL | backbone=Qwen-7B | 2026.04 | 48.23 | 3.16 |
| VLK-RL | backbone=GPT-4o-mini | 2026.04 | 45.89 | 3.11 |
| CAPID | | 2026.04 | 39.27 | 2.51 |
| GDP-Zero | | 2026.04 | 38.96 | 2.49 |
| TransferTOD | | 2026.04 | 37.41 | 2.45 |
| GALAXY | | 2026.04 | 36.12 | 2.34 |
| ACGOS | | 2026.04 | 31.85 | 2.22 |
| PPO | | 2026.04 | 26.18 | 2.08 |