Abstract: Stochastic Gradient Descent (SGD) is the method of choice for large-scale optimization problems, most notably in deep learning. Recent studies aim to improve the convergence and speed of the SGD algorithm.