Theory of deep learning

Dimensionality reduction, regularization, and generalization in overparameterized regressions

We show that PCA-based dimensionality reduction avoids the peaking phenomenon of double descent, and that overparameterization may not be necessary for good generalization.
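The peaking phenomenon can be illustrated with a small simulation. The sketch below is not the paper's experiment: the latent-factor data model, the sample sizes, the noise levels, and all function names are assumptions made for illustration. It compares the test error of minimum-norm least squares on all p features with principal component regression (PCA to k components, then least squares) as p moves through the interpolation threshold p = n.

```python
import numpy as np

# Illustrative sizes (assumptions, not taken from the paper).
N_TRAIN, N_TEST, LATENT_DIM = 40, 500, 5


def draw_data(rng, p):
    """Hypothetical latent-factor model: p noisy features driven by a few factors."""
    W = rng.normal(size=(LATENT_DIM, p)) / np.sqrt(LATENT_DIM)  # feature loadings
    theta = rng.normal(size=LATENT_DIM)                         # true signal

    def sample(n):
        Z = rng.normal(size=(n, LATENT_DIM))
        X = Z @ W + 0.1 * rng.normal(size=(n, p))   # features = factors + noise
        y = Z @ theta + 0.5 * rng.normal(size=n)    # response = signal + noise
        return X, y

    return sample(N_TRAIN), sample(N_TEST)


def min_norm_risk(X_tr, y_tr, X_te, y_te):
    """Test MSE of the minimum-norm least-squares fit on all p features."""
    beta = np.linalg.pinv(X_tr) @ y_tr  # OLS if p < n, min-norm interpolator if p >= n
    return float(np.mean((X_te @ beta - y_te) ** 2))


def pcr_risk(X_tr, y_tr, X_te, y_te, k=LATENT_DIM):
    """Test MSE of principal component regression with k components."""
    mu, y_bar = X_tr.mean(axis=0), y_tr.mean()
    _, _, Vt = np.linalg.svd(X_tr - mu, full_matrices=False)
    V_k = Vt[:k].T                                   # top-k principal directions
    scores = (X_tr - mu) @ V_k
    gamma, *_ = np.linalg.lstsq(scores, y_tr - y_bar, rcond=None)
    pred = (X_te - mu) @ V_k @ gamma + y_bar
    return float(np.mean((pred - y_te) ** 2))


def average_risks(p_values, n_trials=20, seed=0):
    """Map each p to (mean min-norm test MSE, mean PCR test MSE) over random trials."""
    rng = np.random.default_rng(seed)
    out = {}
    for p in p_values:
        runs = []
        for _ in range(n_trials):
            (X_tr, y_tr), (X_te, y_te) = draw_data(rng, p)
            runs.append((min_norm_risk(X_tr, y_tr, X_te, y_te),
                         pcr_risk(X_tr, y_tr, X_te, y_te)))
        out[p] = tuple(float(v) for v in np.mean(runs, axis=0))
    return out


if __name__ == "__main__":
    for p, (mn, pcr) in average_risks([10, 20, 40, 80, 160]).items():
        print(f"p={p:4d}  min-norm MSE={mn:10.3f}  PCR MSE={pcr:8.3f}")
```

In this toy setup the minimum-norm error should spike near p = 40 (the training-set size) and fall again for larger p, while the PCR error should stay comparatively flat. The number of components k is fixed to the assumed latent dimension here; in practice it would have to be chosen, for example by cross-validation.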