In this video we look at Batch Normalization: the how (the forward pass) and the why. In the next video we will cover the backprop derivatives and how BatchNorm behaves at inference time. A short NumPy sketch of the forward pass appears after the paper list below.
Papers:
* Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (Ioffe and Szegedy, 2015)
* How Does Batch Normalization Help Optimization? (Santurkar et al. 2018)
* Exponential convergence rates for Batch Normalization: The power of length-direction decoupling in non-convex optimization (Kohler et al. 2018)
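As a taste of what the video covers, here is a minimal NumPy sketch of the batch-norm forward pass. The function name, the gamma/beta parameters, and the eps default are illustrative assumptions, not the course's exact code:

import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    # x has shape (batch, features); gamma/beta are the learnable scale/shift
    mu = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                    # per-feature batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalize; eps guards zero variance
    return gamma * x_hat + beta            # scale and shift restore expressivity

# quick check: each output feature is ~zero-mean, unit-variance
x = 5 + 10 * np.random.randn(4, 3)
out = batchnorm_forward(x, np.ones(3), np.zeros(3))
print(out.mean(axis=0), out.std(axis=0))

At inference time the batch statistics are replaced by running averages collected during training; that is covered in the next video.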
NN Playlist: https://bit.ly/3PvvYSF
Become a member and get full access to this online course:
https://meerkatstatistics.com/courses/neural-networks-in-python-numpy-and-pytorch/
*** 🎉 Special YouTube 60% Discount on Yearly Plan – valid for the first 100 subscribers. Voucher code: First100 🎉 ***
"NN with Python" Course Outline:
*Intro*
* Administration
* Intro - Long
* Notebook - Intro to Python
* Notebook - Intro to PyTorch
*Comparison to other methods*
* Linear Regression vs. Neural Network
* Logistic Regression vs. Neural Network
* GLM vs. Neural Network
*Expressivity / Capacity*
* Hidden Layers: 0 vs. 1 vs. 2+
*Training*
* Backpropagation - Part 1
* Backpropagation - Part 2
* Implement a NN in NumPy
* Notebook - Implementation redo: Classes instead of Functions (NumPy)
* Classification - Softmax and Cross Entropy - Theory
* Classification - Softmax and Cross Entropy - Derivatives
* Notebook - Implementing Classification (NumPy)
*Autodiff*
* Automatic Differentiation
* Forward vs. Reverse mode
*Symmetries in Weight Space*
* Tanh & Permutation Symmetries
* Notebook - Tanh, Permutation, ReLU symmetries
*Generalization*
* Generalization and the Bias-Variance Trade-Off
* Generalization Code
* L2 Regularization / Weight Decay
* DropOut regularization
* Notebook - DropOut (PyTorch)
* Notebook - DropOut (NumPy)
* Notebook - Early Stopping
*Improved Training*
* Weight Initialization - Part 1: What NOT to do
* Notebook - Weight Initialization 1
* Weight Initialization - Part 2: What to do
* Notebook - Weight Initialization 2
* Notebook - TensorBoard
* Learning Rate Decay
* Notebook - Input Normalization
* Batch Normalization - Part 1: Theory
* Batch Normalization - Part 2: Derivatives
* Notebook - BatchNorm (PyTorch)
* Notebook - BatchNorm (NumPy)
*Activation Functions*
* Classical Activations
* ReLU Variants
*Optimizers*
* SGD Variants: Momentum, NAG, AdaGrad, RMSprop, AdaDelta, Adam, AdaMax, Nadam - Part 1: Theory
* SGD Variants: Momentum, NAG, AdaGrad, RMSprop, AdaDelta, Adam, AdaMax, Nadam - Part 2: Code
*Auto Encoders*
* Variational Auto Encoders
If you’re looking for statistical consultation, help with interesting projects, or a training workshop, visit my website https://meerkatstatistics.com/ or contact me directly at
[email protected]
~~~~~ SUPPORT ~~~~~
PayPal me: https://paypal.me/MeerkatStatistics
~~~~~~~~~~~~~~~~~
Intro/Outro Music: Dreamer - by Johny Grimes
https://www.youtube.com/watch?v=CACgsYjeK54