What is Term Frequency (TF)? 

A Term Frequency matrix has one row per document in the corpus and one column per word in the vocabulary. The entry in row d and column w is the number of times word w occurs in document d; a value of 0 means the word does not appear in that document. A large corpus typically has a large vocabulary, and any single document uses only a small fraction of it, so the TF matrix is usually large and sparse.
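
As a rough illustration, a TF matrix for a tiny corpus can be built as sketched below. The function name `build_tf_matrix`, the two example documents, and the lowercase-and-split-on-whitespace tokenization are all assumptions made for this sketch; they are not a prescribed method.

```python
from collections import Counter


def build_tf_matrix(docs):
    """Return (vocabulary, matrix), where matrix[d][j] is the number of
    times vocabulary[j] occurs in docs[d]."""
    # Assumption: naive tokenization (lowercase, split on whitespace).
    tokenized = [doc.lower().split() for doc in docs]

    # The vocabulary is every distinct word in the corpus, sorted so that
    # the column order is deterministic.
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    column = {word: j for j, word in enumerate(vocabulary)}

    matrix = []
    for tokens in tokenized:
        row = [0] * len(vocabulary)  # 0 means the word is absent
        for word, count in Counter(tokens).items():
            row[column[word]] = count
        matrix.append(row)
    return vocabulary, matrix


# Hypothetical two-document corpus, used only for illustration.
docs = ["the cat sat", "the dog sat on the mat"]
vocabulary, tf = build_tf_matrix(docs)

print(vocabulary)
for doc_id, row in enumerate(tf):
    print(doc_id, row)
```

Even in this toy corpus several entries are zero. The dense list-of-lists used here is only practical for small examples; with a realistic vocabulary, a sparse representation that stores only the nonzero counts is the usual choice.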