Multi-turn Text-to-SQL on SParC Out-of-domain

77.4 Execution Accuracy (EX)

[Long-Horizon] Warm-Start SFT + RL (Outcome + Process)

[Chart: results over time; y-axis ticks 57.12, 62.385, 67.65, 72.915; x-axis tick Oct 12, 2025]
Updated 1mo ago

Evaluation Results

Method    EX      (unlabeled)    Date       Links
          77.4    69.1
          76      69             2025.10
          75.1    68.9
          74.5    68             2025.10
          74      64             2025.10
          74      64.4           2025.10
          73.7    62.4           2025.10
          73.4    64
          73      66.2           2025.10
          71.7    65.1           2025.10
          71.3    62.7           2025.10
          70.6    62.3           2025.10
          70      61.5           2025.10
          67.3    58.1           2025.10
          61.9    56.1           2025.10
          57.9    48.5