An n-gram model builds on the bag-of-words approach by considering sequences of n consecutive tokens in addition to the individual tokens of a document. For example, a bigram model (n=2) creates a feature from each pair of consecutive tokens in the text. If a document consists of the sentence “It is sunny and 75 degrees”, then in addition to the individual words, the n-grams of size 2 would be the pairs [“It is”, “is sunny”, “sunny and”, “and 75”, “75 degrees”].
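The sliding-window idea above can be sketched in a few lines; the helper name `ngrams` is hypothetical, not from any particular library:

```python
def ngrams(tokens, n):
    # Slide a window of size n over the token list and join each
    # window into a single string feature.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "It is sunny and 75 degrees".split()
print(ngrams(tokens, 2))
# → ['It is', 'is sunny', 'sunny and', 'and 75', '75 degrees']
```

In practice, libraries such as scikit-learn expose the same idea through a parameter like `ngram_range`, so the unigram and bigram features can be generated together in one vectorization pass.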