Explain the basic architecture and training process of a Neural Network model. Briefly discuss the key hyper-parameters.
What is an activation function, and what are some of the most common choices for activation functions?
Briefly describe the architecture of a Recurrent Neural Network (RNN) and how it addresses the shortcomings of traditional Neural Networks.
What are transformers? Discuss the evolution, advantages, and major breakthroughs of transformer models.
What is meant by saturation in neural network training? Discuss the problems associated with saturation.
What is an activation function? What are the different types of activation functions? Discuss their pros and cons.