
Machine Learning Resources

What is the problem with using a generic list of stop words? 


A generic list of English stop words may not be appropriate if the documents all belong to a specific domain. For example, if the documents are all about finance, words like “money” and “economy” might occur in nearly every document and do little to differentiate one document from another. In that case, a useful preprocessing step is to append these domain-specific words to the stop word list so that they are pruned from the vocabulary before vectorization.
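As a rough illustration, the idea can be sketched in plain Python. Everything here is an assumption made for the example: the tiny `GENERIC_STOP_WORDS` set, the toy finance documents, the helper names, and the 80% document-frequency cut-off used to decide which words count as domain-specific. In practice the threshold and the resulting list would need to be reviewed for the corpus at hand.

```python
import re
from collections import Counter

# Tiny illustrative generic list (an assumption); real lists are much longer.
GENERIC_STOP_WORDS = {"the", "a", "is", "of", "and", "in", "to", "on"}


def tokenize(text):
    """Lowercase the text and keep runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def domain_stop_words(docs, generic, min_doc_fraction=0.8):
    """Return words outside `generic` that occur in at least
    `min_doc_fraction` of the documents (hypothetical helper)."""
    doc_freq = Counter()
    for doc in docs:
        # Count each word once per document (document frequency).
        doc_freq.update(set(tokenize(doc)))
    threshold = min_doc_fraction * len(docs)
    return {w for w, df in doc_freq.items() if df >= threshold and w not in generic}


def build_vocabulary(docs, stop_words):
    """Collect all tokens across the documents, minus the stop words."""
    return sorted({t for doc in docs for t in tokenize(doc)} - stop_words)


# Toy finance corpus (made up for this example).
docs = [
    "The economy is slowing and money is tight",
    "Money markets react to the economy",
    "Bond yields rise as money leaves the economy",
    "The economy and money supply: a primer on inflation",
]

extra = domain_stop_words(docs, GENERIC_STOP_WORDS)
stop_words = GENERIC_STOP_WORDS | extra  # generic list plus domain-specific words
vocab = build_vocabulary(docs, stop_words)

print(sorted(extra))
print(vocab)
```

With this toy corpus, “money” and “economy” appear in every document, so they are added to the stop word list and dropped from the vocabulary, while rarer words such as “inflation” remain. If scikit-learn is used for vectorization, a combined list like this can be passed through the `stop_words` parameter of `CountVectorizer`, and its `max_df` parameter offers a similar frequency-based cut-off.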
