TheoryofDeepLearning.2022
algorithmic regularization 55
Since $\lim_{t} \frac{w_t}{\|w_t\|} = \lim_{t} \frac{\dot{w}_t}{\|w_t\|}$ [?], then
\[
\lim_{t\to\infty} \frac{w_t}{\|w_t\|} = \sum_{i\in S} \nabla f_i(g_t \bar{w}). \tag{6.22}
\]
Thus we have shown w satisfies the first-order optimality condition of
Definition 6.4.1.
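The limiting-direction argument above can be checked numerically on a toy problem. The sketch below is an illustration under stated assumptions, not the setting of the proof: it assumes a linear model $f_i(w) = \langle w, z_i \rangle$ with $z_i = y_i x_i$, the exponential loss $\sum_i \exp(-f_i(w))$, and plain gradient descent with a small step as a stand-in for gradient flow. The data `Z`, the step size, the step count, and all function names are hypothetical. It tracks whether the normalized iterate settles on a fixed direction, whether the update $-\nabla L(w_t)$ ends up pointing along that same direction, and whether that direction matches the normalized sum of the support-vector gradients $\nabla f_i = z_i$.

```python
import math

# Hypothetical toy data: z_i = y_i * x_i for a linear model f_i(w) = <w, z_i>.
# The first two points are the support vectors; the third has a larger margin.
Z = [(1.0, 1.0), (1.0, -1.0), (3.0, 0.5)]


def neg_grad(w):
    """Return -grad L(w) for the exponential loss L(w) = sum_i exp(-<w, z_i>)."""
    gx = gy = 0.0
    for zx, zy in Z:
        coeff = math.exp(-(w[0] * zx + w[1] * zy))
        gx += coeff * zx
        gy += coeff * zy
    return (gx, gy)


def normalize(v):
    """Scale a 2-D vector to unit Euclidean norm."""
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)


def cosine(u, v):
    """Cosine of the angle between two 2-D vectors."""
    u, v = normalize(u), normalize(v)
    return u[0] * v[0] + u[1] * v[1]


def run(steps=10000, lr=0.1, w0=(0.0, 1.0)):
    """Gradient descent, used here as an assumed discretization of gradient flow."""
    w = w0
    for _ in range(steps):
        g = neg_grad(w)
        w = (w[0] + lr * g[0], w[1] + lr * g[1])
    return w


w_start = (0.0, 1.0)
w_final = run(w0=w_start)

# Direction of the final iterate, and the normalized sum of the
# support-vector gradients z_1 + z_2.
direction = normalize(w_final)
support_sum = normalize((Z[0][0] + Z[1][0], Z[0][1] + Z[1][1]))

# Alignment between the iterate and its update direction, before and after.
alignment_start = cosine(w_start, neg_grad(w_start))
alignment_end = cosine(w_final, neg_grad(w_final))

print("direction of w_t:       ", direction)
print("support-vector sum:     ", support_sum)
print("cos(w, -grad) at start: ", alignment_start)
print("cos(w, -grad) at end:   ", alignment_end)
```

In this symmetric example the two support vectors receive equal weight, so the plain sum $z_1 + z_2$ already gives the limiting direction after normalization. For less symmetric data the support vectors would generally enter with unequal nonnegative weights, which this sketch does not model.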
6.5 Induced bias in function space
≪Suriya notes: Jason: can you introduce the idea of induced biases and give special results for
linear convnets, any relevant results from your and Tengyu's margin paper, and the infinite-width
two-layer ReLU network?≫