TheoryofDeepLearning.2022
Figure 8.2: Generalization error vs. complexity measure.
that original labels have much better alignment with top eigenvectors,
thus enjoying faster convergence.
Understanding Generalization of Ultra-wide Neural Networks. The approximation
in Equation (8.8) implies that the final prediction function
of an ultra-wide neural network is approximately the kernel prediction
function defined in Equation (8.6). Therefore, we can use the generalization
theory for kernels to analyze the generalization behavior
of ultra-wide neural networks. For the kernel prediction function
defined in Equation (8.6), we can use a Rademacher complexity bound
to derive the following generalization bound for a 1-Lipschitz loss
function (which is an upper bound on the classification error):
$$\frac{\sqrt{2\, y^\top (H^*)^{-1} y \cdot \mathrm{tr}(H^*)}}{n}. \tag{8.10}$$
This is a data-dependent complexity measure that upper bounds the
generalization error.
We can check this complexity measure empirically. In Figure 8.2,
we compare the generalization error ($\ell_1$ loss and classification error)
with this complexity measure. We vary the fraction of random labels
in the dataset to see how the generalization error and the complexity
measure change. We use the neural network architecture defined in
Equation (8.7) with the ReLU activation function and train only the first
layer. The left panel uses data from two classes of MNIST and the
right panel uses two classes from CIFAR. The complexity measure
closely tracks the trend of the generalization error as the fraction of
random labels increases.
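The complexity measure in Equation (8.10) is cheap to evaluate once a Gram matrix is available, so the random-label experiment can be sketched on synthetic data. The following is a minimal NumPy sketch under several assumptions that are not taken from the text: the inputs are unit-norm, the Gram matrix uses the closed form $H^*_{ij} = x_i^\top x_j \,(\pi - \arccos(x_i^\top x_j)) / (2\pi)$ commonly used for a two-layer ReLU network with only the first layer trained, the clean labels come from a random linear classifier, and the helper names (`relu_ntk_gram`, `complexity_measure`, `label_noise_sweep`) are hypothetical. It does not reproduce the MNIST or CIFAR results of Figure 8.2.

```python
import numpy as np


def relu_ntk_gram(X):
    """Gram matrix with H_ij = x_i.x_j * (pi - arccos(x_i.x_j)) / (2*pi).

    Assumption: rows of X are unit-norm, and this closed form stands in
    for the kernel induced by the architecture of Equation (8.7).
    """
    G = np.clip(X @ X.T, -1.0, 1.0)  # clip guards arccos against round-off
    return G * (np.pi - np.arccos(G)) / (2.0 * np.pi)


def complexity_measure(H, y):
    """Evaluate sqrt(2 * y^T H^{-1} y * tr(H)) / n from Equation (8.10)."""
    n = y.shape[0]
    alpha = np.linalg.solve(H, y)  # H^{-1} y without forming the inverse
    return float(np.sqrt(2.0 * (y @ alpha) * np.trace(H)) / n)


def label_noise_sweep(n=30, d=10, fractions=(0.0, 0.5, 1.0), seed=0):
    """Return (fraction, complexity) pairs on synthetic data (illustrative)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)  # project onto unit sphere
    y_clean = np.sign(X @ rng.standard_normal(d))  # labels from a linear rule
    H = relu_ntk_gram(X)
    results = []
    for p in fractions:
        y = y_clean.copy()
        k = int(round(p * n))
        idx = rng.choice(n, size=k, replace=False)
        y[idx] = rng.choice([-1.0, 1.0], size=k)  # replace with random labels
        results.append((p, complexity_measure(H, y)))
    return results


if __name__ == "__main__":
    for p, c in label_noise_sweep():
        print(f"random-label fraction {p:.1f}: complexity {c:.3f}")
```

Solving the linear system with `np.linalg.solve` avoids forming $(H^*)^{-1}$ explicitly, which is both cheaper and numerically safer when the Gram matrix is poorly conditioned. Whether the printed values rise with the fraction of random labels depends on the data and the seed, so no particular output is claimed here.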
8.4 NTK formula for Multilayer Fully-connected Neural Network
In this section we present the NTK formulas for fully-connected
neural networks. We first define a fully-connected neural net formally.
Let $x \in \mathbb{R}^d$ be the input, and denote $g^{(0)}(x) = x$ and $d_0 = d$ for
notational convenience. We define an L-hidden-layer fully-connected