Accelerating Rescaled Gradient Descent: Fast Optimization of Smooth Functions
Authors: Ashia Wilson, Lester Mackey, Andre Wibisono
Abstract: We present a family of algorithms, called descent algorithms, for optimizing convex and non-convex functions. We also introduce a new first-order algorithm, called rescaled gradient descent (RGD), and show that RGD achieves a faster convergence rate than gradient descent provided the function is strongly smooth, a natural generalization of the standard smoothness assumption on the objective function. When the objective function is convex, we present two novel frameworks for "accelerating" descent methods, one in the style of Nesterov and the other in the style of Monteiro and Svaiter, using a single Lyapunov function. Rescaled gradient descent can be accelerated under the same strong smoothness assumption using both frameworks. We provide several examples of strongly smooth loss functions in machine learning, along with numerical experiments that verify our theoretical findings. We also present several extensions of our novel Lyapunov framework, including deriving optimal universal tensor methods and extending our framework to the coordinate setting.
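The abstract does not state the RGD update itself. As a rough illustration only, the sketch below assumes the update x_{k+1} = x_k - eta^(1/(p-1)) * grad f(x_k) / ||grad f(x_k)||^((p-2)/(p-1)) for a smoothness order p, which reduces to plain gradient descent at p = 2. The function names, the quartic toy objective, and the step size eta = 0.1 are hypothetical choices made for this example and are not taken from the paper.

```python
import math


def rescaled_gradient_descent(grad, x0, p=4, eta=0.1, steps=50):
    """Sketch of a rescaled gradient step (assumed form, see lead-in).

    x <- x - eta**(1/(p-1)) * g / ||g||**((p-2)/(p-1))
    With p = 2 the rescaling factor is 1 and this is ordinary gradient descent.
    """
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        norm = math.sqrt(sum(gi * gi for gi in g))
        if norm == 0.0:
            break  # stationary point reached; avoid dividing by zero
        scale = eta ** (1.0 / (p - 1)) / norm ** ((p - 2.0) / (p - 1))
        x = [xi - scale * gi for xi, gi in zip(x, g)]
    return x


def quartic(x):
    # Toy objective f(x) = (1/4) * sum_i x_i^4 (illustrative assumption).
    return sum(xi ** 4 for xi in x) / 4.0


def quartic_grad(x):
    # Gradient of the toy objective: component i is x_i^3.
    return [xi ** 3 for xi in x]


if __name__ == "__main__":
    start = [1.0, 0.5]
    rgd = rescaled_gradient_descent(quartic_grad, start, p=4)
    gd = rescaled_gradient_descent(quartic_grad, start, p=2)
    print("RGD (p=4) final objective:", quartic(rgd))
    print("GD  (p=2) final objective:", quartic(gd))
```

On this toy problem the gradient shrinks cubically near the minimizer, so plain gradient descent slows down, while the rescaled step keeps shrinking the iterate by a roughly constant factor. This is a single hand-picked example and should not be read as a reproduction of the paper's experiments.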