
Character-Aware Neural Language Models

About

We describe a simple neural language model that relies only on character-level inputs. Predictions are still made at the word-level. Our model employs a convolutional neural network (CNN) and a highway network over characters, whose output is given to a long short-term memory (LSTM) recurrent neural network language model (RNN-LM). On the English Penn Treebank the model is on par with the existing state-of-the-art despite having 60% fewer parameters. On languages with rich morphology (Arabic, Czech, French, German, Spanish, Russian), the model outperforms word-level/morpheme-level LSTM baselines, again with fewer parameters. The results suggest that on many languages, character inputs are sufficient for language modeling. Analysis of word representations obtained from the character composition part of the model reveals that the model is able to encode, from characters only, both semantic and orthographic information.
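The character-composition pipeline described above (character embeddings → CNN with max-over-time pooling → highway network, whose output feeds the LSTM) can be sketched in plain numpy. All dimensions here are illustrative, not the configuration used in the paper, and the weights are random rather than trained:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (not the paper's exact hyperparameters)
char_vocab, char_dim = 50, 15
filter_widths = [2, 3, 4]   # CNN filters of several widths over characters
num_filters = 25            # filters per width

emb = rng.normal(size=(char_vocab, char_dim))  # character embedding table

def char_cnn(char_ids):
    """1-D convolutions over a word's characters, max-pooled over time."""
    X = emb[char_ids]                                  # (word_len, char_dim)
    feats = []
    for w in filter_widths:
        W = rng.normal(size=(w * char_dim, num_filters))
        # slide a width-w filter over the character sequence
        windows = np.stack([X[i:i + w].ravel()
                            for i in range(len(char_ids) - w + 1)])
        feats.append(np.tanh(windows @ W).max(axis=0))  # max over time
    return np.concatenate(feats)  # word representation built from characters

def highway(x):
    """Highway layer: gated mix of a nonlinear transform and the input."""
    d = x.size
    W_h, b_h = rng.normal(size=(d, d)), np.zeros(d)
    W_t, b_t = rng.normal(size=(d, d)), np.full(d, -2.0)  # bias toward carry
    t = 1.0 / (1.0 + np.exp(-(x @ W_t + b_t)))            # transform gate
    return t * np.maximum(0.0, x @ W_h + b_h) + (1.0 - t) * x

# A 4-character word as character ids; the result would be the LSTM's input.
word_vec = highway(char_cnn(np.array([3, 7, 12, 9])))
print(word_vec.shape)  # (75,) = 3 widths × 25 filters each
```

In the full model, one such vector is produced per word and the sequence of vectors is fed to the LSTM language model in place of conventional word embeddings, which is why the parameter count drops: there is no large word-embedding matrix.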

Yoon Kim, Yacine Jernite, David Sontag, Alexander M. Rush • 2015

Related benchmarks

Task                           Dataset                             Metric                    Result   Rank
Language Modeling              Penn Treebank (test)                Perplexity                78.9     411
Language Modeling              Penn Treebank (val)                 Perplexity                82       178
Language Modeling              One Billion Word Benchmark (test)   Test Perplexity           25.88    108
Language Modeling              Penn Treebank word-level (test)     Perplexity                78.9     72
Readability Assessment         WeeBit                              Rank                      21       21
Text Readability Assessment    OneStopEnglish                      Pearson Correlation (ρ)   0.42     21
Readability Assessment         OneStopEnglish                      Rank                      16       21
Text Readability Assessment    Newsela                             Pearson Correlation (ρ)   0.512    21
Readability Assessment         WeeBit                              Pearson Correlation (ρ)   -0.082   21

(Showing 10 of 18 rows.)
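For reference, the perplexity figures above are the exponentiated average negative log-likelihood the model assigns to the test words (lower is better). A minimal illustration, with made-up probabilities rather than model outputs:

```python
import math

# Hypothetical model probabilities for four test words (not real results)
probs = [0.1, 0.05, 0.2, 0.01]

# Average negative log-likelihood per word, then exponentiate
nll = -sum(math.log(p) for p in probs) / len(probs)
ppl = math.exp(nll)
print(round(ppl, 2))  # ≈ 17.78
```

Equivalently, perplexity is the inverse geometric mean of the assigned probabilities, so a perplexity of 78.9 on Penn Treebank means the model is, on average, about as uncertain as a uniform choice over ~79 words.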
