A New Physics-Inspired Theory of Deep Learning | Optimal initialization of Neural Nets

A special video about recent exciting developments in mathematical deep learning! 🔥 Make sure to check out the video if you want a quick visual summary of the contents of the book “The Principles of Deep Learning Theory” https://deeplearningtheory.com/.

SPONSOR: Aleph Alpha 👉 https://app.aleph-alpha.com/

17:38 ERRATUM: Boris Hanin reached out to us and made this point: "I found the explanations to be crisp and concise, except for one point. Namely, I am pretty sure the description you give of why MLPs become linear models at infinite width is not quite correct. It is not true that they are equivalent to a random feature model in which features are the post-activations of the final hidden layer and that activations in previous layers don’t move. Instead, what happens is that the full vector of activations in each layer moves by an order 1 amount. However, while the Jacobian of the model output with respect to its parameters remains order 1, the Hessian goes to zero. Put another way, the whole neural network can be replaced by its linearization around the start of training. In the resulting linear model all parameters move to fit the data."

Check out our daily #MachineLearning Quiz Questions: https://www.youtube.com/c/AICoffeeBreak/community
➡️ AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/

📕 The book: Roberts, Daniel A., Sho Yaida, and Boris Hanin. The Principles of Deep Learning Theory. Cambridge University Press, 2022. https://arxiv.org/abs/2106.10165
📜 MAGMA paper: https://arxiv.org/abs/2112.05253

Outline:
00:00 The Principles of Deep Learning Theory (Book)
02:12 Neural networks and black boxes
05:35 Large-width limit
07:59 How to get the large-width limit and forward propagation recap
13:11 Why we need non-Gaussianity
16:28 No wiring for infinite-width networks
17:13 No representation learning for infinite-width networks
19:31 Layer recursion
22:36 Experimental verification
24:09 The Renormalisation Group
26:08 Fixed points
28:45 Stability
31:15 Experimental verification (activation functions)
34:57 Outro and thanks
35:26 Sponsor: Aleph Alpha

Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏 Don Rosenthal, Dres. Trost GbR, banana.dev -- Kyle Morris, Julián Salazar, Edvard Grødem, Vignesh Valliappan, Kevin Tsai, Mutual Information, Mike Ton

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, pay us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

🔗 Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research

Music 🎵: It's Only Worth It if You Work for It (Instrumental) - NEFFEX
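P.S. To make Boris Hanin's erratum above a bit more concrete: "replacing the network by its linearization around the start of training" means training the first-order Taylor expansion f(x; θ) ≈ f(x; θ₀) + J(x; θ₀)·(θ − θ₀) instead of the full network. Below is a minimal, hypothetical JAX sketch of that idea (not from the video or the book); the layer widths, toy inputs, and parameter perturbation are arbitrary choices for illustration.

# Sketch: compare a wide MLP with its linearization around random initialization theta0.
import jax
import jax.numpy as jnp

def init_mlp(key, widths):
    """Random MLP weights with 1/sqrt(fan_in) scaling (illustrative choice)."""
    params = []
    for n_in, n_out in zip(widths[:-1], widths[1:]):
        key, sub = jax.random.split(key)
        params.append(jax.random.normal(sub, (n_in, n_out)) / jnp.sqrt(n_in))
    return params

def mlp(params, x):
    """Forward pass: tanh hidden layers, linear readout."""
    h = x
    for W in params[:-1]:
        h = jnp.tanh(h @ W)
    return h @ params[-1]

key = jax.random.PRNGKey(0)
widths = [4, 512, 512, 1]                 # wide hidden layers (arbitrary sizes)
params0 = init_mlp(key, widths)           # initialization theta0
x = jax.random.normal(jax.random.PRNGKey(1), (8, 4))  # toy inputs

# jax.linearize returns f(theta0) and the Jacobian-vector product at theta0,
# i.e. exactly the linear-in-parameters model described in the erratum.
f0, jvp_at_theta0 = jax.linearize(lambda p: mlp(p, x), params0)

# Move every parameter by a small amount delta and evaluate the linearized model.
delta = jax.tree_util.tree_map(lambda W: 1e-3 * jnp.ones_like(W), params0)
f_lin = f0 + jvp_at_theta0(delta)

# Compare with the true network at theta0 + delta: for wide layers the two stay
# close, consistent with the parameter Hessian shrinking as width grows.
params = jax.tree_util.tree_map(lambda W, d: W + d, params0, delta)
f_true = mlp(params, x)
print("max |f_true - f_lin|:", float(jnp.max(jnp.abs(f_true - f_lin))))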