What are Transformer Neural Networks?

167,122 plays
This short tutorial covers the basics of the Transformer, a neural network architecture designed for handling sequential data in machine learning.

Timestamps:
0:00 - Intro
1:18 - Motivation for developing the Transformer
2:44 - Input embeddings (start of encoder walk-through)
3:29 - Attention
6:29 - Multi-head attention
7:55 - Positional encodings
9:59 - Add & norm, feedforward, & stacking encoder layers
11:14 - Masked multi-head attention (start of decoder walk-through)
12:35 - Cross-attention
13:38 - Decoder output & prediction probabilities
14:46 - Complexity analysis
16:00 - Transformers as graph neural networks

Original Transformers paper:
Attention is All You Need - https://arxiv.org/abs/1706.03762

Other papers mentioned:
(GPT-3) Language Models are Few-Shot Learners - https://arxiv.org/abs/2005.14165
(DALL-E) Zero-Shot Text-to-Image Generation - https://arxiv.org/abs/2102.12092
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding - https://arxiv.org/abs/1810.04805
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity - https://arxiv.org/abs/2101.03961
Finetuning Pretrained Transformers into RNNs - https://arxiv.org/abs/2103.13076
Efficient Transformers: A Survey - https://arxiv.org/abs/2009.06732
Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth - https://arxiv.org/abs/2103.03404
Do Transformer Modifications Transfer Across Implementations and Applications? - https://arxiv.org/abs/2102.11972
Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies - https://ml.jku.at/publications/older/ch7.pdf
Transformers are Graph Neural Networks (blog post) - https://thegradient.pub/transformers-are-graph-neural-networks

Video style inspired by 3Blue1Brown
Music: Trinkets by Vincent Rubinetti

Links:
YouTube: https://www.youtube.com/ariseffai
Twitter: https://twitter.com/ari_seff
Homepage: https://www.ariseff.com

If you'd like to help support the channel (completely optional), you can donate a cup of coffee via the following:
Venmo: https://venmo.com/ariseff
PayPal: https://www.paypal.me/ariseff