Speaker: Dmitry Yarotsky (Steklov Institute of Mathematics)
Time: 2026-03-26 16:00–17:00
Venue: Zoom ID: 810 6791 0505 (Password: 765324)
Abstract:
It is well known, both in practice and in theory, that momentum can accelerate gradient descent (GD). In ill-conditioned problems with power-law spectral data, momentum with a suitable schedule doubles the convergence exponent. However, this approach to acceleration fails for stochastic GD (SGD): for any fixed batch size, the optimization diverges.
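For illustration only (not taken from the talk or the paper): a minimal sketch comparing plain GD with classical heavy-ball momentum on a quadratic whose Hessian has power-law eigenvalues $\lambda_k = k^{-a}$. The decay exponent a, step size lr, and momentum coefficient beta are illustrative choices, and this is the standard heavy-ball method, not the Corner Gradient Descent introduced below.

import numpy as np

n, a = 200, 1.0                      # problem size and spectral decay exponent (illustrative)
lam = np.arange(1, n + 1) ** (-a)    # power-law eigenvalues of the Hessian
x0 = np.ones(n)                      # initial point; the optimum is x = 0

def gd(steps, lr):
    x = x0.copy()
    for _ in range(steps):
        x -= lr * lam * x            # gradient of 0.5 * sum(lam * x**2)
    return 0.5 * np.sum(lam * x**2)

def heavy_ball(steps, lr, beta):
    x, v = x0.copy(), np.zeros(n)
    for _ in range(steps):
        v = beta * v - lr * lam * x  # momentum buffer
        x += v
    return 0.5 * np.sum(lam * x**2)

print("GD final loss:        ", gd(2000, lr=1.0))
print("heavy-ball final loss:", heavy_ball(2000, lr=1.0, beta=0.9))

On such a spectrum the small eigenvalues dominate the tail of the loss, and the momentum run drives it down markedly faster than plain GD, which is the (non-stochastic) acceleration the abstract refers to.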
We show, however, that acceleration of SGD can be achieved by what we call Corner Gradient Descent. The key idea is to extend GD with linear memory and to identify different such extensions with different contours in the complex plane. Corner algorithms correspond to contours having a corner with external angle $\theta\pi$ for some $1 < \theta < 2$. It turns out that such algorithms accelerate the convergence exponent of non-stochastic GD by the factor $\theta$. In the stochastic case the effect is more complex and is described by a phase diagram; in one of its regions the acceleration factor can be made arbitrarily close to 2.
Publication: https://openreview.net/forum?id=nOXCfIdhD9
Bio:
Dmitry Yarotsky obtained his PhD in mathematics from Moscow State University in 2002, and has since worked at the Institute for Information Transmission Problems, the Dublin Institute for Advanced Studies, Munich University, Skoltech, and the Steklov Institute of Mathematics. His interests cover a wide range of topics in applied mathematics, from mathematical physics to data analysis and optimization. His current focus is on rigorous results concerning the expressiveness of neural networks and gradient-based optimization.
Join Zoom Meeting
https://us02web.zoom.us/j/81067910505?pwd=ReSIxXyA90zOTSA0zYuKzl2GZmacda.1
Meeting ID: 810 6791 0505
Passcode: 765324
