Publications
The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training, Schaipp F, Hägele A, Taylor A, Simsekli U, Bach F. ICML 2025. [abs] [pdf] [poster]
MoMo: Momentum Models for Adaptive Learning Rates, Schaipp F, Ohana R, Eickenberg M, Defazio A, Gower R. ICML 2024. [abs] [pdf] [poster]
A Semismooth Newton Stochastic Proximal Point Algorithm with Variance Reduction, Milzarek A, Schaipp F, Ulbrich M. SIOPT 2024. [abs] [pdf]
SGD with Clipping is Secretly Estimating the Median Gradient, Schaipp F, Garrigos G, Simsekli U, Gower R. arXiv preprint 2024. [abs] [pdf]
Robust Gradient Estimation in the Presence of Heavy-Tailed Noise, Schaipp F, Simsekli U, Gower R. NeurIPS Workshop Heavy Tails in Machine Learning 2023. [abs] [pdf]
A Stochastic Proximal Polyak Step Size, Schaipp F, Gower R, Ulbrich M. TMLR 2023. [abs] [pdf]
GGLasso - a Python package for General Graphical Lasso computation, Schaipp F, Vlasovets O, Müller C. JOSS 2021. [abs] [pdf]