Explain the basic architecture and training process of a neural network model. Briefly discuss the key hyper-parameters.
What is an activation function? What are the different types of activation functions? Discuss their pros and cons.
What is an activation function, and what are some of the most common choices for activation functions?
What is meant by saturation in neural network training? Discuss the problems associated with saturation.