Gradient descent (GD) and its variants are vital ingredients in neural network training. It is widely believed that the impressive generalization performance of trained models is due in part to an implicit bias of GD towards specific minima of the loss landscape. In this talk, we will review and discuss approaches to rigorously identify and analyze the implicit regularization of GD in simplified training settings. Furthermore, we provide evidence suggesting that a single implicit bias is not sufficient to explain the effectiveness of GD across training tasks.
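
A minimal sketch of one such bias, assuming the classical overparameterized least-squares setting (not taken from the talk itself): gradient descent on the squared loss, initialized at zero, converges to the minimum-ℓ2-norm solution that interpolates the data, which can be recovered via the pseudoinverse. The dimensions, step size, and iteration count below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overparameterized linear regression: more parameters than data points,
# so infinitely many weight vectors fit the data exactly.
n, d = 20, 100
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Gradient descent on the mean squared loss, initialized at zero.
w = np.zeros(d)
lr = 0.01
for _ in range(50_000):
    grad = X.T @ (X @ w - y) / n
    w -= lr * grad

# The minimum-l2-norm interpolating solution, via the pseudoinverse.
w_min_norm = np.linalg.pinv(X) @ y

print("training residual:", np.linalg.norm(X @ w - y))                   # ~0: GD interpolates the data
print("distance to min-norm solution:", np.linalg.norm(w - w_min_norm))  # ~0: the implicit bias
```

Because GD started at zero only moves within the row space of the data matrix, it selects the interpolating solution of smallest ℓ2 norm among the infinitely many global minima, one concrete instance of an implicit bias towards specific minima.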