One of the main drawbacks of deep learning is that it is more prone to overfitting than more traditional machine learning models. However, several techniques can mitigate that risk, dropout being one of them.
Dropout refers to randomly turning off hidden units so that a smaller network is trained on a given pass through the data. Each node within the hidden layers has some probability of being turned off, so when the network is trained over many iterations, the data is fed through different but simpler networks. This results in lower variance than if the same, more complex model were used in each pass. In this sense, dropout approximates the variance reduction of an ensemble of networks, without having to train each one separately.
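
To make the mechanism concrete, here is a minimal NumPy sketch of dropout applied to one layer's hidden activations. The function name `dropout_forward`, its parameters, and the toy input are illustrative assumptions rather than anything from a particular library. The sketch also uses the common "inverted dropout" variant, which rescales the surviving units by `1 / keep_prob` during training so that the expected activation is unchanged and nothing needs to be rescaled at prediction time.

```python
import numpy as np

def dropout_forward(activations, drop_prob, rng, training=True):
    """Apply inverted dropout to a batch of hidden activations.

    Hypothetical helper for illustration: `drop_prob` is the probability
    that any single hidden unit is turned off.
    """
    # At prediction time the full network is used, so nothing is dropped.
    if not training or drop_prob == 0.0:
        return activations

    keep_prob = 1.0 - drop_prob
    # Draw a fresh random mask on every call, so each training pass
    # effectively runs a different, thinner network.
    mask = rng.random(activations.shape) < keep_prob
    # Zero the dropped units and rescale the survivors (inverted dropout).
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
hidden = np.ones((4, 8))  # toy batch: 4 examples, 8 hidden units

train_out = dropout_forward(hidden, drop_prob=0.5, rng=rng)
eval_out = dropout_forward(hidden, drop_prob=0.5, rng=rng, training=False)
```

With `drop_prob=0.5`, every entry of `train_out` is either 0 (unit dropped) or 2 (unit kept and rescaled), while `eval_out` is identical to the input. In practice, deep learning frameworks ship their own dropout layers, so a hand-written version like this is mainly useful for understanding what those layers do.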