Transformed Protoform Reconstruction

About

Protoform reconstruction is the task of inferring what the morphemes or words of an ancestral language looked like, given their reflexes in a set of daughter languages. Meloni et al. (2021) achieved the state of the art on Latin protoform reconstruction with an RNN-based encoder-decoder model with attention. We update their model with the current state-of-the-art seq2seq architecture: the Transformer. Our model outperforms theirs on a suite of metrics across two datasets: their Romance dataset of 8,000 cognates spanning 5 languages and a Chinese dataset (Hou 2004) of 800+ cognates spanning 39 varieties. We also probe our model for potential phylogenetic signal. Our code is publicly available at https://github.com/cmu-llab/acl-2023.
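As a rough sketch of how such seq2seq reconstruction models are typically fed (the language tags, separator token, and toy cognate set below are illustrative assumptions, not the paper's exact preprocessing): the daughter forms of a cognate set are concatenated into a single source sequence, each form prefixed with a tag identifying its language, and the decoder is trained to emit the protoform.

```python
# Sketch of Meloni-style input encoding for protoform reconstruction.
# Language tags, separator token, and the toy cognate set are
# illustrative assumptions, not the exact format used in the paper.

def encode_cognate_set(cognates):
    """Flatten {language: [phoneme tokens]} into one source sequence,
    prefixing each daughter form with a language-tag token."""
    tokens = []
    for lang, phonemes in sorted(cognates.items()):
        tokens.append(f"<{lang}>")   # language tag token
        tokens.extend(phonemes)
        tokens.append("<sep>")       # separator between daughter forms
    return tokens

# Toy cognate set for Latin 'octo' ("eight") -- illustrative only.
cognates = {
    "it": ["o", "t", "t", "o"],
    "es": ["o", "tʃ", "o"],
    "fr": ["ɥ", "i", "t"],
}
src = encode_cognate_set(cognates)
print(src)
```

The encoder then reads this flat sequence, and the language tags let the model condition on which daughter each span of phonemes came from.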

Young Min Kim, Kalvin Chang, Chenxuan Cui, David Mortensen • 2023

Related benchmarks

Task                        Dataset             Result          Rank
Linguistic Reconstruction   Rom-phon            PED 0.9027      10
Linguistic Reconstruction   Sinitic             PED 0.9814      6
Protoform reconstruction    Sinitic             PED 0.9814      6
Linguistic Reconstruction   Rom-orth            PED 0.5568      5
Protoform reconstruction    Rom-orth            PED 0.5568      5
Word Reconstruction         Romance Rom-phon    PED 1.2516      5
Word Reconstruction         Romance Rom-orth    PED 1.1622      5
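PED in the results above stands for phoneme edit distance: the Levenshtein distance between the predicted and gold protoforms, computed over phoneme tokens rather than characters (so a multi-character phoneme like "tʃ" counts as one unit). A minimal sketch of that metric, assuming the standard unit-cost Levenshtein formulation (normalization conventions vary between papers):

```python
def phoneme_edit_distance(pred, gold):
    """Levenshtein distance over phoneme token sequences (not characters),
    using a single rolling row of the dynamic-programming table."""
    m, n = len(pred), len(gold)
    dp = list(range(n + 1))  # distances for the empty prediction prefix
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i  # prev holds dp[i-1][j-1]
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(
                dp[j] + 1,       # delete pred[i-1]
                dp[j - 1] + 1,   # insert gold[j-1]
                prev + (pred[i - 1] != gold[j - 1]),  # substitute or match
            )
            prev = cur
    return dp[n]

# Toy example: one substitution (k vs. tʃ) between predicted and gold forms.
print(phoneme_edit_distance(["o", "k", "t", "o"], ["o", "tʃ", "t", "o"]))  # 1
```

Treating phonemes as tokens matters for datasets like these, where IPA segments often span several Unicode characters; a character-level distance would over-penalize errors on such segments.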
