Transformers Explained! This architecture comes from the amazing paper "Attention Is All You Need". Link here: https://arxiv.org/abs/1706.03762
Parts of this architecture are used in state-of-the-art technologies such as GPT-3 and variations of BERT. So, you need to know what a transformer model is before diving into the more advanced methods, since they all build on the principles of the transformer model!
I explain what you NEED to know and nothing more!
Feel free to support me! Do know that just viewing my content is plenty of support! 😍
☕Consider supporting me! https://ko-fi.com/spencerpao ☕
Watch Next?
Named Entity Recognition →
https://youtu.be/4pCB1lZrBcQ
Text Cleaning and Preprocessing →
https://youtu.be/ZucclQNVBlo
🔗 My Links 🔗
Github: https://github.com/SpencerPao/spencerpao.github.io
My Website: https://spencerpao.github.io/
Notebook: https://github.com/SpencerPao/Natural-Language-Processing/tree/main/Transformers
📓 Requirements 🧐
Python
Jupyter Notebook
⌛ Timeline ⌛
0:00 - What is a Transformer? Additional Resources
1:41 - Why use a Transformer architecture?
3:17 - Encoder Block
7:52 - Decoder Block
10:20 - Multi-head Attention Mechanisms Explained Further
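💻 Bonus: if you want to poke at the core idea from the 10:20 chapter in code, here is a minimal PyTorch sketch of multi-head self-attention, based on the "Attention Is All You Need" paper. This is an illustrative sketch, not the notebook linked above; the class and parameter names (MultiHeadSelfAttention, d_model, num_heads) are my own assumptions.

```python
# Minimal multi-head self-attention sketch (illustrative; names are assumptions).
import math
import torch
import torch.nn as nn

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, d_model: int = 512, num_heads: int = 8):
        super().__init__()
        assert d_model % num_heads == 0, "d_model must divide evenly across heads"
        self.num_heads = num_heads
        self.d_head = d_model // num_heads
        # One linear projection each for queries, keys, values, plus the output.
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch, seq_len, d_model = x.shape

        # Project, then split the model dimension across heads:
        # (batch, seq, d_model) -> (batch, heads, seq, d_head)
        def split(t):
            return t.view(batch, seq_len, self.num_heads, self.d_head).transpose(1, 2)

        q, k, v = split(self.q_proj(x)), split(self.k_proj(x)), split(self.v_proj(x))

        # Scaled dot-product attention: softmax(Q K^T / sqrt(d_head)) V
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)
        weights = scores.softmax(dim=-1)
        attended = weights @ v

        # Merge heads back: (batch, heads, seq, d_head) -> (batch, seq, d_model)
        merged = attended.transpose(1, 2).contiguous().view(batch, seq_len, d_model)
        return self.out_proj(merged)

# Usage: a batch of 2 sequences, 10 tokens each, 512-dim embeddings.
x = torch.randn(2, 10, 512)
print(MultiHeadSelfAttention()(x).shape)  # torch.Size([2, 10, 512])
```

The full encoder block from the 3:17 chapter wraps this attention layer with residual connections, layer normalization, and a feed-forward network; see the notebook linked above for the complete walkthrough.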
🏷️Tags🏷️:
Python, Transformers, Transformer, Machine Learning, Deep Learning, Encoder, Decoder, Attention, multi-head attention, embedding, Natural Language Processing, natural, language, processing, word, token, Normalization, Positional encoding, layers, softmax, output, probabilities, feed-forward network, network, tutorial, how to, instruction, gpt-3, building block, BERT
🔔Current Subs🔔:
2,906