CAIT: A Syntactic Parsing Toolkit for Child-Adult InTeractions

About

CHILDES is a paramount resource for language acquisition studies, yet computational tools for analyzing its syntactic structure remain limited. Leveraging the recent release of the UD-English-CHILDES treebank with gold-standard Universal Dependencies (UD) annotations, we train a state-of-the-art dependency parser specifically tailored to CHILDES. The parser more accurately captures syntactic patterns in child-adult interactions, outperforming widely used off-the-shelf English parsers, including spaCy and Stanza. Alongside the parser, we also release a Part-of-Speech tagger and an utterance-level construction tagger, which together form the open-source Syntactic Parsing Toolkit for Child-Adult InTeractions (CAIT). Through a detailed error analysis and a case study tracking the distribution of syntactic constructions across developmental time in CHILDES, we demonstrate the practical utility of the toolkit for large-scale, reproducible research on language acquisition.

Francesca Padovani, Xiulin Yang, Bastian Bunzeck, Jaap Jumelet, Yevgen Matusevych, Nathan Schneider, Arianna Bisazza • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Dependency Parsing | UD CHILDES (dev) | UAS | 96.23 | 13 |
| Dependency Parsing | UD CHILDES (test) | UAS | 0.9491 | 13 |
| Construction Tagging | CHILDES MPI-EVA-Manchester (test) | Accuracy | 92.32 | 4 |
| Construction Tagging | MPI-EVA-Manchester (CHILDES) (dev) | Accuracy | 92.05 | 3 |
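
For readers unfamiliar with the metric, UAS (unlabeled attachment score) is conventionally the share of tokens whose predicted syntactic head matches the gold head. The sketch below illustrates that standard definition only. It is not the authors' evaluation script, the function name and toy data are hypothetical, and details such as punctuation handling or corpus-level averaging are not specified on this page. It also prints the score both as a fraction and as a percentage, since the two UAS entries listed here (96.23 and 0.9491) appear to be reported on different scales.

```python
def unlabeled_attachment_score(gold_heads, pred_heads):
    """Return the fraction of tokens whose predicted head equals the gold head.

    Hypothetical helper for illustration; heads are 1-based token indices,
    with 0 marking the root, as in CoNLL-U files.
    """
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted head lists must have equal length")
    if not gold_heads:
        raise ValueError("cannot score an empty utterance")
    correct = sum(g == p for g, p in zip(gold_heads, pred_heads))
    return correct / len(gold_heads)


# Toy 4-token utterance (made-up data): the third token is attached wrongly.
gold = [2, 0, 4, 2]
pred = [2, 0, 2, 2]
uas = unlabeled_attachment_score(gold, pred)
print(f"UAS = {uas:.4f} (fraction) = {100 * uas:.2f} (percent)")
# prints: UAS = 0.7500 (fraction) = 75.00 (percent)
```

In practice the score is usually accumulated over all tokens in a test set rather than averaged per utterance, so short utterances do not dominate the result.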