Charformer: Fast Character Transformers via Gradient-based Subword Tokenization +Tokenizer explained

5,339 views
What is tokenization? Ms. Coffee Bean explains tokenization in general, explains why flexible tokenization is important, and then moves on to explaining the "Charformer: Fast Character Transformers via Gradient-based Subword Tokenization" paper (explained and visualized).

➡️ AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/

Paper 📄: Tay, Yi, Vinh Tran, Sebastian Ruder, Jai Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, and Donald Metzler. "Charformer: Fast Character Transformers via Gradient-based Subword Tokenization." (2021). https://arxiv.org/abs/2106.12672

📺 Replacing self-attention with the Fourier Transform: https://youtu.be/j7pWPdGEfMA
📺 Convolutions instead of self-attention. When is a Transformer not a Transformer anymore?: https://youtu.be/xchDU2VMR4M
📺 Transformer explained: https://youtu.be/FWFA4DGuzSc

Outline:
00:00 What are tokenizers good for?
02:49 Where does rigid tokenization fail?
03:51 Charformer: end-to-end tokenization
08:33 Again, but in summary.
09:57 Reducing the sequence length
10:37 Meta-comments on token mixing

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

🔗 Links:
YouTube: https://www.youtube.com/AICoffeeBreak
Twitter: https://twitter.com/AICoffeeBreak
Reddit: https://www.reddit.com/r/AICoffeeBreak/

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research
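To give a feel for the idea covered at 03:51 and 09:57, here is a toy NumPy sketch of gradient-based subword tokenization (GBST) as described in the paper: each character position forms candidate "block" embeddings at several block sizes, a softmax over block scores mixes them into a soft subword embedding, and the sequence is then shortened by mean-pooling. The block sizes, dimensions, and the random scoring vector standing in for learned parameters are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

seq_len, d_model = 12, 8          # toy dimensions (assumption)
block_sizes = [1, 2, 3]           # candidate subword block sizes (assumption)

# Character-level embeddings for a toy input sequence.
X = rng.standard_normal((seq_len, d_model))

def block_embeddings(X, b):
    """Mean-pool consecutive blocks of size b, then repeat each block
    embedding b times so every character position has a candidate."""
    n = X.shape[0] // b
    pooled = X[: n * b].reshape(n, b, -1).mean(axis=1)   # (n, d_model)
    up = np.repeat(pooled, b, axis=0)                    # back to char resolution
    if up.shape[0] < X.shape[0]:                         # pad if b doesn't divide seq_len
        up = np.vstack([up, np.repeat(up[-1:], X.shape[0] - up.shape[0], axis=0)])
    return up

# A random projection stands in for the learned block-scoring network.
w = rng.standard_normal(d_model)

candidates = np.stack([block_embeddings(X, b) for b in block_sizes])  # (B, seq_len, d)
scores = candidates @ w                                               # (B, seq_len)

# Softmax over block sizes at each position -> soft, differentiable choice.
probs = np.exp(scores - scores.max(axis=0, keepdims=True))
probs /= probs.sum(axis=0, keepdims=True)

# Position-wise mixture of candidate block embeddings.
latent = (probs[..., None] * candidates).sum(axis=0)                  # (seq_len, d_model)

# Downsample by mean-pooling with rate r to reduce sequence length.
r = 2
downsampled = latent.reshape(seq_len // r, r, d_model).mean(axis=1)   # (6, 8)
print(downsampled.shape)
```

Because the block choice is a softmax mixture rather than a hard segmentation, gradients flow through the scoring step, which is what lets the tokenization be learned end to end with the rest of the model.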