What are Transformer Models and how do they work?

148,812 listens
This is the last of a series of 3 videos where we demystify Transformer models and explain them with visuals and friendly examples.

Video 1: The attention mechanism at a high level https://www.youtube.com/watch?v=OxCpWwDCDFQ
Video 2: The attention mechanism with math https://www.youtube.com/watch?v=UPtG_38Oq8o
Video 3 (This one): Transformer models

If you like this material, check out LLM University from Cohere! https://llm.university

Get the Grokking Machine Learning book! https://manning.com/books/grokking-machine-learning
Discount code (40%): serranoyt (Use the discount code at checkout)

00:00 Introduction
01:50 What is a transformer?
04:35 Generating one word at a time
08:59 Sentiment Analysis
13:05 Neural Networks
18:18 Tokenization
19:12 Embeddings
25:06 Positional encoding
27:54 Attention
32:29 Softmax
35:48 Architecture of a Transformer
39:00 Fine-tuning
42:20 Conclusion