A term frequency (TF) matrix has one row per document in the corpus, indexed by document ID, and one column per word in the vocabulary. The entry in row d and column w is the number of occurrences of word w in document d; a value of 0 means the word does not appear in that document. A large corpus typically has a large vocabulary, and any single document uses only a small fraction of it, so most entries are 0 and the matrix is usually large and sparse.
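As an illustration of the definition above, here is a minimal sketch that builds a small TF matrix using only the Python standard library. The function name `build_tf_matrix`, the sample documents, and the tokenization (lowercasing and splitting on whitespace) are assumptions made for this example; real pipelines usually tokenize more carefully and store the result in a sparse format rather than a dense list of lists.

```python
from collections import Counter


def build_tf_matrix(docs):
    """Build a dense term frequency matrix.

    docs: dict mapping a document ID to its text.
    Returns (doc_ids, vocab, matrix), where matrix[i][j] is the number of
    times vocab[j] occurs in the document doc_ids[i].
    """
    # Assumption: naive tokenization (lowercase, split on whitespace).
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}

    # The vocabulary is every distinct word in the corpus, sorted so that
    # the column order is deterministic.
    vocab = sorted({word for tokens in tokenized.values() for word in tokens})

    doc_ids = list(tokenized)
    matrix = []
    for doc_id in doc_ids:
        counts = Counter(tokenized[doc_id])
        # Counter returns 0 for words that are absent from this document.
        matrix.append([counts[word] for word in vocab])
    return doc_ids, vocab, matrix


# Hypothetical two-document corpus.
docs = {
    "d1": "the cat sat on the mat",
    "d2": "the dog barked",
}
doc_ids, vocab, tf = build_tf_matrix(docs)

# Fraction of entries equal to 0, a rough measure of sparsity.
zero_fraction = sum(row.count(0) for row in tf) / (len(tf) * len(vocab))
```

Even in this tiny corpus a sizable share of the entries are 0, and the share grows quickly as the vocabulary gets larger, which is why sparse storage is the usual choice in practice.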