Demystifying Different Variants of Gradient Descent Optimization Algorithm

Neural networks are a supervised learning method and therefore require a large training set of complete records, including the target variable. Training a deep neural network means iteratively searching for the best parameters (weights and biases) of the network, and on large data sets this iterative process is very slow. A good optimization algorithm for updating the parameters can therefore speed up learning considerably. The choice of optimization algorithm in deep learning influences both the training speed and the final performance of the network.
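
As a minimal sketch of the iterative update idea described above (not a definitive implementation, and the names params, grad_fn, and lr are illustrative placeholders): each iteration computes the gradient of the loss with respect to the parameters and takes a small step in the opposite direction.

import numpy as np

def gradient_descent_step(params, grad_fn, lr=0.01):
    """One vanilla gradient descent update: theta = theta - lr * grad."""
    grad = grad_fn(params)      # gradient of the loss w.r.t. the parameters
    return params - lr * grad   # step opposite the gradient (steepest descent)

# Toy example: minimize f(theta) = ||theta||^2, whose gradient is 2 * theta.
theta = np.array([3.0, -2.0])
for _ in range(100):
    theta = gradient_descent_step(theta, lambda p: 2 * p, lr=0.1)
print(theta)  # approaches [0, 0], the minimizer of f

Every variant discussed in the rest of this article modifies how this basic step is computed, for example how much data is used per gradient estimate or how the step size adapts over time.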