747: Technical Intro to Transformers and LLMs — with Kirill Eremenko

#LLMs #TransformerArchitecture #AttentionMechanism

http://www.superdatascience.com/llmcourse

Attention and transformers in LLMs, the five stages of data processing, and a brand-new Large Language Models A-Z course: Kirill Eremenko joins host @JonKrohnLearns to explore what goes into well-crafted LLMs, what makes transformers so powerful, and how to succeed as a data scientist in this new age of generative AI.

This episode is brought to you by Intel and HPE Ezmeral Software Solutions (https://bit.ly/hpeyt), and by Prophets of AI (https://prophetsofai.com), the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit https://jonkrohn.com/podcast for sponsorship information.

In this episode you will learn:
• [00:00:00] Introduction
• [00:06:58] Supply and demand in AI recruitment
• [00:14:06] Kirill and Hadelin's new course on LLMs, “Large Language Models (LLMs), Transformers & GPT A-Z”
• [00:18:14] The difficulty of learning LLMs
• [00:20:28] The basics of LLMs
• [00:34:58] The five building blocks of transformer architecture
• [00:42:38] 1: Input embedding
• [00:49:13] 2: Positional encoding
• [00:52:32] 3: Attention mechanism
• [01:14:44] 4: Feedforward neural network
• [01:17:43] 5: Linear transformation and softmax
• [01:27:39] Inference vs. training time
• [01:47:49] Why transformers are so powerful

Additional materials: https://www.superdatascience.com/747
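For readers skimming these notes before listening: the five building blocks named in the timestamps can be sketched end to end in a few lines of numpy. This is a minimal illustrative sketch only, not the course's code; all dimensions, weight matrices, and token ids below are made up for demonstration, and the attention step is a single head without masking or layer normalization.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model, d_ff, seq_len = 100, 16, 64, 5  # illustrative sizes

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# 1. Input embedding: look up a vector for each token id
W_embed = rng.normal(size=(vocab_size, d_model))
tokens = np.array([3, 14, 15, 92, 65])               # made-up token ids
x = W_embed[tokens]                                  # (seq_len, d_model)

# 2. Positional encoding: add sinusoidal position information
pos = np.arange(seq_len)[:, None]
i = np.arange(d_model)[None, :]
angle = pos / 10000 ** (2 * (i // 2) / d_model)
x = x + np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

# 3. Attention mechanism: scaled dot-product self-attention (one head)
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v
x = softmax(Q @ K.T / np.sqrt(d_model)) @ V

# 4. Feedforward neural network: two linear layers with a ReLU between
W1 = rng.normal(size=(d_model, d_ff))
W2 = rng.normal(size=(d_ff, d_model))
x = np.maximum(0, x @ W1) @ W2

# 5. Linear transformation and softmax: project to vocabulary probabilities
W_out = rng.normal(size=(d_model, vocab_size))
probs = softmax(x @ W_out)
print(probs.shape)  # (5, 100): one probability distribution per position
```

A real transformer stacks steps 3 and 4 many times with residual connections and normalization; the episode walks through each stage in depth.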