Explain-then-Process: Using Grammar Prompting to Enhance Grammatical Acceptability Judgments

About

Large language models (LLMs) can explain grammatical rules, yet they often fail to apply those rules when judging sentence acceptability. We present "grammar prompting", an explain-then-process paradigm: a large LLM first produces a concise explanation of the relevant syntactic phenomenon, then that explanation is fed back as additional context to the target model -- either an LLM or a smaller language model (SLM) -- before deciding which sentence of a minimal pair is grammatical. On the English BLiMP, Chinese SLING, and Russian RuBLiMP benchmarks, this simple prompt design yields substantial improvements over strong baselines across many syntactic phenomena. Feeding an LLM's metalinguistic explanation back to the target model bridges the gap between knowing a rule and using it. On SLMs, grammar prompting alone trims the average LLM-SLM accuracy gap by about 20%, and when paired with chain-of-thought, by 56% (13.0 pp -> 5.8 pp), all at negligible cost. The lightweight, language-agnostic cue lets low-cost SLMs approach frontier-LLM performance in multilingual settings.
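The two-stage pipeline described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the prompt wording is an assumption, and `explainer`/`judge` are hypothetical callables that would wrap real LLM/SLM API calls in practice.

```python
# Sketch of the "explain-then-process" grammar-prompting pipeline.
# `explainer` and `judge` are placeholders for actual model API calls.

def build_explanation_prompt(phenomenon: str) -> str:
    """Ask a large LLM for a concise explanation of a syntactic phenomenon."""
    return (
        f"Briefly explain the grammatical rule behind '{phenomenon}' "
        "in one or two sentences."
    )

def build_judgment_prompt(explanation: str, sent_a: str, sent_b: str) -> str:
    """Feed the explanation back as context before the minimal-pair judgment."""
    return (
        f"Grammar note: {explanation}\n\n"
        "Which sentence is grammatical?\n"
        f"A: {sent_a}\n"
        f"B: {sent_b}\n"
        "Answer with A or B."
    )

def grammar_prompting(explainer, judge, phenomenon, sent_a, sent_b):
    """Stage 1: explainer states the rule. Stage 2: judge applies it."""
    explanation = explainer(build_explanation_prompt(phenomenon))
    return judge(build_judgment_prompt(explanation, sent_a, sent_b))
```

Because the target model only receives extra prompt context, the same wrapper works whether `judge` is a frontier LLM or a small language model, which is what keeps the method cheap and language-agnostic.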

Russell Scheinberg, Ameeta Agrawal, Amber Shore, So Young Lee • 2025

Related benchmarks

Task | Dataset | Metric | Result | Rank
Grammaticality Judgment | RuBLiMP Russian 1.0 | Anaphora | 96.7 | 28
Linguistic Minimal Pair Evaluation | BLiMP (test) | NPI lic. (2) | 100 | 28
Syntactic Evaluation | SLING Chinese-language | Alt. Quest. | 100 | 28
Linguistic Minimal Pair Evaluation | BLiMP, SLING, and RuBLiMP Combined (test) | Average Score | 0.928 | 4
