Grammar Prompting for Domain-Specific Language Generation with Large Language Models

About

Large language models (LLMs) can learn to perform a wide range of natural language tasks from just a handful of in-context examples. However, for generating strings from highly structured languages (e.g., semantic parsing to complex domain-specific languages), it is challenging for the LLM to generalize from just a few exemplars. We propose grammar prompting, a simple approach to enable LLMs to use external knowledge and domain-specific constraints, expressed through a grammar in Backus-Naur Form (BNF), during in-context learning. Grammar prompting augments each demonstration example with a specialized grammar that is minimally sufficient for generating the particular output example, where the specialized grammar is a subset of the full DSL grammar. For inference, the LLM first predicts a BNF grammar given a test input, and then generates the output according to the rules of the grammar. Experiments demonstrate that grammar prompting can enable LLMs to perform competitively on a diverse set of DSL generation tasks, including semantic parsing (SMCalFlow, Overnight, GeoQuery), PDDL planning, and SMILES-based molecule generation.
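
The prompt construction described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy grammar `FULL_GRAMMAR`, the helper names `specialize`, `to_bnf`, and `build_prompt`, the prompt layout, and the example query are all hypothetical. The sketch also assumes the rules used to derive each demonstration output are already known (in practice they would come from parsing the output with the full DSL grammar).

```python
# Toy DSL grammar (hypothetical): nonterminal -> list of alternative right-hand sides.
FULL_GRAMMAR = {
    "query": ["'answer(' expr ')'"],
    "expr": [
        "'state(' expr ')'",
        "'city(' expr ')'",
        "'largest(' expr ')'",
        "'all'",
    ],
}


def specialize(full_grammar, used_rules):
    """Keep only the (lhs, rhs) rules that appear in used_rules.

    The result is a subset of the full grammar, listed in full-grammar order.
    """
    used = set(used_rules)
    specialized = {}
    for lhs, alternatives in full_grammar.items():
        kept = [rhs for rhs in alternatives if (lhs, rhs) in used]
        if kept:
            specialized[lhs] = kept
    return specialized


def to_bnf(grammar):
    """Render a grammar dict as BNF text, one nonterminal per line."""
    return "\n".join(
        f"{lhs} ::= {' | '.join(alts)}" for lhs, alts in grammar.items()
    )


def build_prompt(demos, test_input):
    """Pair each demonstration with its specialized grammar, then leave the
    grammar slot open for the test input so the model predicts it first."""
    blocks = []
    for demo in demos:
        grammar = specialize(FULL_GRAMMAR, demo["used_rules"])
        blocks.append(
            f"Input: {demo['input']}\n"
            f"Grammar:\n{to_bnf(grammar)}\n"
            f"Output: {demo['output']}"
        )
    blocks.append(f"Input: {test_input}\nGrammar:")
    return "\n\n".join(blocks)


# One hypothetical demonstration with the rules used to derive its output.
DEMOS = [
    {
        "input": "what is the largest state",
        "output": "answer(largest(state(all)))",
        "used_rules": [
            ("query", "'answer(' expr ')'"),
            ("expr", "'largest(' expr ')'"),
            ("expr", "'state(' expr ')'"),
            ("expr", "'all'"),
        ],
    }
]

prompt = build_prompt(DEMOS, "what is the largest city")
print(prompt)
```

In this sketch the demonstration's grammar omits the unused `city` rule, so the model sees only the rules needed for that output. At inference the prompt ends at `Grammar:`, and a second step (not shown) would ask the model to generate the program under the grammar it predicted.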

Bailin Wang, Zi Wang, Xuezhi Wang, Yuan Cao, Rif A. Saurous, Yoon Kim • 2023

Related benchmarks

Task | Dataset | Metric | Result | Rank
Semantic Parsing | SMCalFlow | Program Accuracy | 88.9 | 22
Semantic Parsing | GeoQuery Few-shot 32 examples (test) | Program Accuracy | 95.7 | 8
Semantic Parsing | SMCalFlow Few-shot 16 examples (test) | Program Accuracy | 83.6 | 8
Semantic Parsing | Overnight Blk Few-shot 32 examples (test) | Program Acc | 74.4 | 8
PDDL Planning | PDDL Blocks domain Pyperplan evaluation tasks | Nodes Created | 170 | 6
Semantic Parsing | GeoQuery Length split | Execution Accuracy | 95.7 | 6
PDDL Planning | PDDL Depot domain Pyperplan evaluation tasks | Nodes Created | 2.92e+3 | 6
PDDL Planning | PDDL Satellite domain Pyperplan evaluation tasks | Nodes Created | 5.16e+3 | 6
Semantic Parsing | Geoquery | Execution Accuracy | 98.6 | 4
Semantic Parsing | Overnight Blk | Execution Acc | 97.2 | 4
Showing 10 of 15 rows
