Semantic Parsing on CFQ (MCD metrics)
Top result: GRPO-Composite, MCD1 Score 92.1 (May 6, 2026)
[Chart: leaderboard plot; metrics: MCD1 Score, MCD2 Score, MCD3 Score, Average MCD Score]
Updated 27d ago
Evaluation Results
| Method | Base Model | Date | MCD1 Score | MCD2 Score | MCD3 Score | Average MCD Score |
|---|---|---|---|---|---|---|
| GRPO-Composite | Llama-3.1-8... | 2026.05 | 92.1 | 76.16 | 83.44 | 83.9 |
| GRPO-Binary | Llama-3.1-8... | 2026.05 | 91.35 | 76.44 | 81.46 | 83.08 |
| SFT | Llama-3.1-8... | 2026.05 | 91.04 | 74.85 | 77.42 | 81.1 |
| GRPO-Binary | Qwen-2.5-7B... | 2026.05 | 89.23 | 75.73 | 76.54 | 80.5 |
| GRPO-Composite | Qwen-2.5-7B... | 2026.05 | 89.16 | 77.33 | 75.86 | 80.78 |
| SFT | Qwen-2.5-7B... | 2026.05 | 88.52 | 73.98 | 73.67 | 78.72 |
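The page does not say how the Average MCD Score is computed. The reported values are consistent with an unweighted mean of the MCD1, MCD2, and MCD3 scores, rounded to two decimals. The sketch below checks that assumption against the six reported rows. The names `ROWS`, `average_mcd`, and `max_deviation` are hypothetical and used only for illustration; base model names are kept truncated exactly as they appear on the page.

```python
from statistics import fmean

# (method, base model as truncated on the page, MCD1, MCD2, MCD3, reported average)
ROWS = [
    ("GRPO-Composite", "Llama-3.1-8...", 92.10, 76.16, 83.44, 83.90),
    ("GRPO-Binary",    "Llama-3.1-8...", 91.35, 76.44, 81.46, 83.08),
    ("SFT",            "Llama-3.1-8...", 91.04, 74.85, 77.42, 81.10),
    ("GRPO-Binary",    "Qwen-2.5-7B...", 89.23, 75.73, 76.54, 80.50),
    ("GRPO-Composite", "Qwen-2.5-7B...", 89.16, 77.33, 75.86, 80.78),
    ("SFT",            "Qwen-2.5-7B...", 88.52, 73.98, 73.67, 78.72),
]


def average_mcd(mcd1: float, mcd2: float, mcd3: float) -> float:
    """Unweighted mean of the three MCD split scores (assumed definition)."""
    return fmean([mcd1, mcd2, mcd3])


def max_deviation(rows) -> float:
    """Largest absolute gap between the computed mean and the reported average."""
    return max(
        abs(average_mcd(m1, m2, m3) - reported)
        for _, _, m1, m2, m3, reported in rows
    )


if __name__ == "__main__":
    for method, base, m1, m2, m3, reported in ROWS:
        computed = average_mcd(m1, m2, m3)
        print(f"{method:<15} {base:<15} computed={computed:.2f} reported={reported:.2f}")
    # A gap below 0.005 (plus float noise) is explained by two-decimal rounding.
    print(f"max deviation: {max_deviation(ROWS):.4f}")
```

Every row agrees with the reported average to within rounding, so the unweighted mean is a plausible reading of the column, though the page itself does not confirm it.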