Machine Learning Resources

What is bootstrapping, and why is it a useful technique?

Bootstrapping is a resampling technique in which new samples are drawn from the observed data with replacement: after an observation is drawn, it is returned to the pool and can be drawn again later in the process. This differs from sampling without replacement, where an observation that has been drawn cannot be drawn again. A bootstrap sample is typically the same size as the original data set, so some observations appear several times and others not at all.
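As a minimal sketch of the difference between the two sampling schemes, the snippet below draws from a small made-up list using Python's standard `random` module. The function names and the data values are illustrative assumptions, not part of the original text.

```python
import random


def sample_with_replacement(data, k, rng):
    # Each draw is independent of the others, so the same observation
    # may appear more than once in the result.
    return rng.choices(data, k=k)


def sample_without_replacement(data, k, rng):
    # Each observation can be drawn at most once, so k cannot exceed len(data).
    return rng.sample(data, k)


rng = random.Random(0)        # fixed seed so the run is repeatable
data = [2, 4, 6, 8, 10]       # hypothetical observations

with_repl = sample_with_replacement(data, len(data), rng)
without_repl = sample_without_replacement(data, len(data), rng)

print("with replacement:   ", with_repl)
print("without replacement:", without_repl)
```

Drawing as many items as there are observations without replacement only reorders the data, whereas drawing with replacement will usually repeat some values and omit others.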

Bootstrapping is useful when the original data set has few observations or when it would be difficult to repeat the experiment on a separate sample of data. Resampling does not add new information, but it lets the one available sample stand in for the population: drawing many bootstrap samples and computing a statistic on each one yields an approximate sampling distribution for that statistic. Quantities such as the mean, standard error, and quantiles can then be estimated from that distribution. Bootstrapping has the advantage of allowing inference without parametric assumptions about the distribution of the data or of the test statistic, although the observations should still be a representative sample from the population.
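One way to make this concrete is a percentile bootstrap confidence interval for the mean. The sketch below is only an illustration under stated assumptions: the observations are made up, the helper names `bootstrap_statistics` and `percentile_interval` are hypothetical, and the choice of 2000 resamples and the simple index-based percentile rule are arbitrary. It uses only the Python standard library.

```python
import random
import statistics


def bootstrap_statistics(data, stat_fn, n_resamples, rng):
    # Draw n_resamples bootstrap samples, each the same size as the data,
    # and evaluate the statistic on every one of them.
    n = len(data)
    return [stat_fn(rng.choices(data, k=n)) for _ in range(n_resamples)]


def percentile_interval(values, confidence=0.95):
    # Take the lower and upper tail cut-offs of the bootstrap distribution
    # (a simple index-based percentile rule, without interpolation).
    ordered = sorted(values)
    alpha = 1.0 - confidence
    lo_idx = int((alpha / 2) * (len(ordered) - 1))
    hi_idx = int((1 - alpha / 2) * (len(ordered) - 1))
    return ordered[lo_idx], ordered[hi_idx]


rng = random.Random(42)  # fixed seed so the run is repeatable
# Hypothetical measurements, invented for this example.
observations = [12.1, 9.8, 11.4, 10.3, 13.7, 8.9, 10.9, 12.5]

boot_means = bootstrap_statistics(observations, statistics.mean, 2000, rng)
low, high = percentile_interval(boot_means)

print("sample mean:", statistics.mean(observations))
print("bootstrap standard error:", statistics.stdev(boot_means))
print("95% percentile interval:", (low, high))
```

Passing `statistics.median` or another function in place of `statistics.mean` would give the corresponding interval for that statistic, which is where the approach is most convenient: no closed-form formula for the sampling distribution is needed.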
