Inspecting Neural Networks with CCA - A Gentle Intro (Explainable AI for Deep Learning)

Canonical Correlation Analysis (CCA) is one of the methods used to explore deep neural networks. Methods such as CKA and SVCCA give us insight into how a neural network processes its inputs, most often by serving as similarity measures between activation matrices. In this video, we look at a number of papers that compare different neural networks with each other, as well as papers that compare the representations of the various layers within a single network. (A minimal code sketch of one such similarity measure, linear CKA, is included at the end of this description.)

Contents:
Introduction (0:00)
Correlation (0:54)
How CCA is used to compare representations (2:50)
SVCCA and Computer Vision models (4:40)
Examining NLP language models with SVCCA: LSTM (9:01)
PWCCA - Projection Weighted Canonical Correlation Analysis (10:22)
How multilingual BERT represents different languages (10:43)
CKA: Centered Kernel Alignment (15:25)
BERT, GPT2, ELMo similarity analysis with CKA (16:07)
Convnets, Resnets, deep nets and wide nets (17:35)
Conclusion (18:59)

Explainable AI Cheat Sheet: https://ex.pegg.io/
1) Explainable AI Intro: https://www.youtube.com/watch?v=Yg3q5x7yDeM&t=0s
2) Neural Activations & Dataset Examples: https://www.youtube.com/watch?v=y0-ISRhL4Ks
3) Probing Classifiers: A Gentle Intro (Explainable AI for Deep Learning): https://www.youtube.com/watch?v=HJn-OTNLnoE

-----
Papers:

SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability
https://arxiv.org/pdf/1706.05806.pdf

Understanding Learning Dynamics Of Language Models with SVCCA
https://arxiv.org/pdf/1811.00225.pdf

Insights on representational similarity in neural networks with canonical correlation
https://arxiv.org/pdf/1806.05759.pdf

BERT is Not an Interlingua and the Bias of Tokenization
https://www.aclweb.org/anthology/D19-6106.pdf

Similarity of Neural Network Representations Revisited
http://proceedings.mlr.press/v97/kornblith19a/kornblith19a.pdf

Similarity Analysis of Contextual Word Representation Models
https://arxiv.org/pdf/2005.01172.pdf

Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth
https://arxiv.org/pdf/2010.15327.pdf

-----
Twitter: https://twitter.com/JayAlammar
Blog: https://jalammar.github.io/
Mailing List: https://jayalammar.substack.com/

------
More videos by Jay:

The Narrated Transformer Language Model
https://youtu.be/-QH8fRhqFHM

Jay's Visual Intro to AI
https://www.youtube.com/watch?v=mSTCzNgDJy4

How GPT-3 Works - Easily Explained with Animations
https://www.youtube.com/watch?v=MQnJZuBGmSQ

Up and Down the Ladder of Abstraction [interactive article by Bret Victor, 2011]
https://www.youtube.com/watch?v=1S6zFOzee78

The Unreasonable Effectiveness of RNNs (Article and Visualization Commentary) [2015 article]
https://www.youtube.com/watch?v=o9LEWynwr6g
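
------
Bonus: what one of these similarity measures looks like in code. Below is a minimal NumPy sketch of linear CKA, in the spirit of "Similarity of Neural Network Representations Revisited" (Kornblith et al., linked above). This is not code from the video; the function name, activation shapes, and toy data are illustrative assumptions. In practice, each matrix would hold one layer's activations, with rows corresponding to the same set of inputs.

import numpy as np

def linear_cka(x, y):
    """Linear CKA similarity between activation matrices x (n x d1) and y (n x d2).

    Rows are the same n inputs; columns are the neurons of each layer,
    so the two layers may have different widths.
    """
    # Center each neuron (column) so the measure ignores mean offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    numerator = np.linalg.norm(y.T @ x, "fro") ** 2
    denominator = np.linalg.norm(x.T @ x, "fro") * np.linalg.norm(y.T @ y, "fro")
    return numerator / denominator

# Toy usage (hypothetical activations): 500 inputs through a 64-unit layer,
# compared against a rotated, slightly noisy copy and against unrelated noise.
rng = np.random.default_rng(0)
acts_a = rng.standard_normal((500, 64))
rotation, _ = np.linalg.qr(rng.standard_normal((64, 64)))  # random orthogonal transform
acts_b = acts_a @ rotation + 0.05 * rng.standard_normal((500, 64))

print(linear_cka(acts_a, acts_b))                          # close to 1: same representation up to rotation
print(linear_cka(acts_a, rng.standard_normal((500, 64))))  # much lower: unrelated activations

Linear CKA is invariant to orthogonal transformations and isotropic scaling of either matrix, which is why the rotated copy still scores near 1; SVCCA and PWCCA (discussed in the video) are built from CCA directions instead but are used in the same plug-two-activation-matrices-in fashion.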