Trends in input representation for state-of-the-art NLP models (2019)

The most natural and intuitive way to represent words as input to a language model (or any NLP task model) is to treat each word as it appears in the text: as a single, indivisible unit.
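To make this concrete, here is a minimal sketch of word-level input representation: each distinct surface form in the corpus becomes one vocabulary entry. The toy corpus and variable names are purely illustrative assumptions, not from any particular system.

```python
# Hypothetical toy corpus, for illustration only.
corpus = ["the cat sat on the mat", "the dog sat on the rug"]

# Build a word-level vocabulary: one integer id per unique word,
# treating every word as a single unit.
vocab = {}
for sentence in corpus:
    for word in sentence.split():
        if word not in vocab:
            vocab[word] = len(vocab)

print(vocab)
# {'the': 0, 'cat': 1, 'sat': 2, 'on': 3, 'mat': 4, 'dog': 5, 'rug': 6}

# An input sentence is then just a sequence of these word ids.
ids = [vocab[w] for w in "the cat sat on the rug".split()]
print(ids)  # [0, 1, 2, 3, 0, 6]
```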

For example, if we are training a language model on a corpus, we would traditionally represent each word as a vector and have the model learn word embeddings, i.e., values for each dimension of that vector. Subsequently, at test time, if we are given a new sentence, the language model can compute how likely that sentence is using those learnt word embeddings.
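The following sketch shows this setup using PyTorch's embedding layer. The vocabulary, embedding size, and example sentence are assumptions made for illustration; a real model would learn these vectors jointly with the rest of its parameters during training.

```python
import torch
import torch.nn as nn

# Hypothetical word-level vocabulary (illustrative only).
vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4}
embedding_dim = 8  # each word becomes a vector of 8 learnable values

# One learnable vector per vocabulary entry; training adjusts the
# value in each dimension of these vectors.
embedding = nn.Embedding(num_embeddings=len(vocab),
                         embedding_dim=embedding_dim)

# At test time, a new sentence is mapped to word ids, then to their
# embedding vectors, which the model consumes to score the sentence.
sentence = "the cat sat on the mat"
ids = torch.tensor([vocab[w] for w in sentence.split()])
vectors = embedding(ids)
print(vectors.shape)  # torch.Size([6, 8]): 6 words, 8 dimensions each
```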