10.06.2025 16:00 Prof. Johannes Maly:
Analyzing the implicit regularization of Gradient Descent
5608.EG.038 (Boltzmannstr. 3, 85748 Garching)

Gradient descent (GD) and its variants are vital ingredients in neural network training. It is widely believed that the impressive generalization performance of trained models is partially due to some form of implicit bias of GD towards specific minima of the loss landscape. In this talk, we will review and discuss approaches to rigorously identify and analyze the implicit regularization of GD in simplified training settings. We will furthermore provide evidence suggesting that a single implicit bias is not sufficient to explain the effectiveness of GD in training tasks.
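
A minimal sketch of one classical, rigorously analyzable instance of this phenomenon (an illustration chosen by the editor, not necessarily the setting of the talk): for an underdetermined least-squares problem with infinitely many global minima, GD initialized at zero converges to the minimum-norm solution, i.e. it is implicitly biased toward one specific minimizer.

```python
import numpy as np

# Implicit bias illustration: GD on min_x 0.5*||Ax - b||^2 with m < n.
# Every x with Ax = b is a global minimum; GD started at x = 0 stays in
# the row space of A and hence converges to the minimum-l2-norm solution.

rng = np.random.default_rng(0)
m, n = 5, 20                              # fewer equations than unknowns
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

x = np.zeros(n)                           # zero initialization is essential
step = 1.0 / np.linalg.norm(A, 2) ** 2    # step size 1/L, L = ||A||_2^2
for _ in range(20000):
    x -= step * A.T @ (A @ x - b)         # gradient of 0.5*||Ax - b||^2

x_min_norm = np.linalg.pinv(A) @ b        # minimum-norm interpolating solution
assert np.allclose(x, x_min_norm, atol=1e-6)  # GD picked out this minimizer
```

Among the infinitely many interpolating solutions, GD selects the one of smallest Euclidean norm without any explicit regularization term in the objective; this is the kind of effect that "implicit regularization" refers to.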