Let $V$ be a vocabulary of words, and let $w \in V$ be a word. A Language Model (LM) is a probability distribution over sequences of words, i.e. $P(w_1, w_2, \dots, w_n)$ with each $w_i \in V$.
The task of a language model is to compute the probability of a given sentence.
Types of LM
- Unigram: makes the basic (false) assumption that the probability of each word is independent of the other words, so that $P(w_1, \dots, w_n) = \prod_{i=1}^{n} P(w_i)$
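- N-gram: dependence on a context of up to $N-1$ preceding words: $P(w_1, \dots, w_n) = \prod_{i=1}^{n} P(w_i \mid w_{i-N+1}, \dots, w_{i-1})$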
- Neural
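
As a rough illustration of the count-based model types listed above, here is a minimal sketch that scores a sentence with a unigram model and with a bigram model (an N-gram model with $N = 2$). The toy corpus, the `<s>` start marker, and the function names `unigram_prob` and `bigram_prob` are assumptions made for this example. Probabilities are plain maximum-likelihood counts with no smoothing, so any unseen word or word pair gets probability zero.

```python
from collections import Counter
import math

# Toy corpus, invented purely for illustration.
corpus = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran"],
]

BOS = "<s>"  # hypothetical start-of-sentence marker used as the first context

# Unigram counts: how often each word occurs in the corpus.
unigram_counts = Counter(w for sent in corpus for w in sent)
total_words = sum(unigram_counts.values())

# Bigram counts: how often each (previous word, current word) pair occurs,
# and how often each word appears as a context.
bigram_counts = Counter()
context_counts = Counter()
for sent in corpus:
    padded = [BOS] + sent
    for prev, cur in zip(padded, padded[1:]):
        bigram_counts[(prev, cur)] += 1
        context_counts[prev] += 1


def unigram_prob(sentence):
    """P(w_1..w_n) as a product of independent word probabilities."""
    return math.prod(unigram_counts[w] / total_words for w in sentence)


def bigram_prob(sentence):
    """P(w_1..w_n) as a product of P(w_i | w_{i-1}), without smoothing."""
    padded = [BOS] + sentence
    p = 1.0
    for prev, cur in zip(padded, padded[1:]):
        if context_counts[prev] == 0:
            return 0.0  # context never seen in the corpus
        p *= bigram_counts[(prev, cur)] / context_counts[prev]
    return p


print(unigram_prob(["the", "cat", "sat"]))
print(unigram_prob(["sat", "cat", "the"]))
print(bigram_prob(["the", "cat", "sat"]))
print(bigram_prob(["sat", "cat", "the"]))
```

The unigram model assigns the same probability to both word orders, which shows why the independence assumption is false for natural language. The bigram model prefers the order it has seen in the corpus and gives the scrambled sentence probability zero.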