TalkTag: Fine-Grained Morphosyntactic Error Annotation for Transcribed Speech
About
Fine-grained morphosyntactic error annotation is important in clinical and developmental language research, yet it is labour-intensive, expert-dependent, and difficult to scale. We present TalkTag, a lightweight LLM-based tool fine-tuned to automate CHAT-style error annotation in spoken-language transcripts. Developed under extreme data scarcity using children's narrative data, the system shows that automated linguistic analysis is feasible in low-resource settings. Our evaluation shows that TalkTag annotates with encouraging precision and effectively identifies cases where linguistic ambiguity makes automated tagging genuinely difficult. TalkTag thus offers a scalable alternative to, and practically viable support for, manual morphosyntactic error annotation.
Related benchmarks
| Task | Dataset | Metric | Score | Rank |
|---|---|---|---|---|
| Morphosyntactic Error Annotation | ENNI tagged utterances raw (test) | Exact Match (EM) | 82.8 | 2 |
| Morphosyntactic Error Annotation | ENNI raw (val) | Exact Match (EM) | 95.4 | 1 |
| Morphosyntactic Error Annotation | ENNI raw (test) | Exact Match (EM) | 93.6 | 1 |
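
For readers unfamiliar with the metric, the following is a minimal sketch of how an utterance-level Exact Match score could be computed. It is not taken from the TalkTag evaluation code: the function names `normalize` and `exact_match`, the whitespace normalization, and the sample utterances with CHAT-style error codes are all assumptions made for illustration, and the paper's exact matching criterion may differ.

```python
from typing import Sequence


def normalize(annotation: str) -> str:
    """Collapse runs of whitespace (assumed normalization, not from the paper)."""
    return " ".join(annotation.split())


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the percentage of predictions identical to their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(
        normalize(pred) == normalize(ref)
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * hits / len(references)


# Hypothetical gold annotations in a CHAT-like style (illustrative only).
references = [
    "he goed [: went] [* m:=ed] home .",
    "the giraffe is happy .",
    "she walk [: walks] [* m:03s] home .",
]

# Hypothetical system output: the third utterance misses the error code.
predictions = [
    "he goed [: went] [* m:=ed] home .",
    "the giraffe  is happy .",
    "she walk home .",
]

score = exact_match(predictions, references)
print(f"Exact Match: {score:.1f}")
```

Under this definition an utterance counts as correct only if the whole annotated string matches, so one missing or misplaced error code makes the utterance a miss. That strictness is the reason Exact Match is a conservative measure for annotation tasks.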