In neural networks, a batch is the subset of the training data that the network processes before its parameters are updated. If the batch size equals the size of the full training set, the weights and biases are updated only once the entire training set has passed through the network; this is batch gradient descent. A smaller batch size usually lets training make progress sooner, because the parameters are updated several times within a single pass over the data, although each update is based on a noisier estimate of the gradient. This is referred to as mini-batch gradient descent. A batch size of one corresponds to stochastic gradient descent, in which the parameters are updated after each observation.
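To make the three cases concrete, here is a minimal sketch of how one dataset splits into batches for different batch sizes. The helper `iterate_batches` and the ten-item `data` list are hypothetical names invented for illustration; they do not come from any particular library.

```python
def iterate_batches(data, batch_size):
    """Yield consecutive slices of `data`, each holding at most `batch_size` items."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


# Ten hypothetical training observations (stand-ins for real examples).
data = list(range(10))

# Batch size equal to the dataset size: one parameter update per pass.
full_batch = list(iterate_batches(data, batch_size=len(data)))

# Smaller batch size: several updates per pass (mini-batch gradient descent).
mini_batches = list(iterate_batches(data, batch_size=4))

# Batch size of one: an update after every observation (stochastic gradient descent).
single = list(iterate_batches(data, batch_size=1))

print(len(full_batch), len(mini_batches), len(single))  # prints: 1 3 10
```

Note that when the batch size does not divide the dataset evenly, the last mini-batch is smaller than the others (here it holds two items). Real training code typically also shuffles the data before batching, which this sketch leaves out.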

An epoch is one complete pass through the entire training data. With batch gradient descent, one epoch consists of a single batch; with mini-batch gradient descent, an epoch is complete only after all of the mini-batches that make up the training dataset have been processed. Reaching a good solution to a complex problem often requires the network to pass through the data many times, so the number of epochs is typically much larger than 1.
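The relationship between epochs, batches, and parameter updates can be sketched with a toy training loop. This is an illustration under stated assumptions, not a definitive implementation: the one-parameter model `y = w * x`, the function `train`, the noise-free data, and the learning rate are all made up for the example.

```python
import math


def train(xs, ys, batch_size, num_epochs, learning_rate=0.01):
    """Fit y = w * x with mini-batch gradient descent on mean squared error."""
    w = 0.0
    num_updates = 0
    for epoch in range(num_epochs):                   # one epoch = one full pass over the data
        for start in range(0, len(xs), batch_size):   # one iteration = one mini-batch
            xb = xs[start:start + batch_size]
            yb = ys[start:start + batch_size]
            # Gradient of the batch's mean squared error with respect to w.
            grad = sum(2 * (w * x - y) * x for x, y in zip(xb, yb)) / len(xb)
            w -= learning_rate * grad                 # parameters change after every batch
            num_updates += 1
    return w, num_updates


# Hypothetical noise-free data generated from y = 3 * x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [3.0 * x for x in xs]

w, num_updates = train(xs, ys, batch_size=2, num_epochs=50)

updates_per_epoch = math.ceil(len(xs) / 2)  # 4 mini-batches per epoch
print(updates_per_epoch, num_updates)       # prints: 4 200
```

With eight observations and a batch size of two, each epoch contains four updates, so 50 epochs produce 200 updates in total. Setting `batch_size=len(xs)` in the same loop would give exactly one update per epoch, which is the batch gradient descent case.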