#LLMs #TransformerArchitecture #AttentionMechanism
http://www.superdatascience.com/llmcourse
Attention and transformers in LLMs, the five stages of data processing, and a brand-new Large Language Models A-Z course: Kirill Eremenko joins host @JonKrohnLearns to explore what goes into well-crafted LLMs, what makes transformers so powerful, and how to succeed as a data scientist in this new age of generative AI.
This episode is brought to you by Intel and HPE Ezmeral Software Solutions (https://bit.ly/hpeyt), and by Prophets of AI (https://prophetsofai.com), the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit https://jonkrohn.com/podcast for sponsorship information.
In this episode you will learn:
• [00:00:00] Introduction
• [00:06:58] Supply and demand in AI recruitment
• [00:14:06] Kirill and Hadelin's new course on LLMs, “Large Language Models (LLMs), Transformers & GPT A-Z”
• [00:18:14] The learning difficulty in understanding LLMs
• [00:20:28] The basics of LLMs
• [00:34:58] The five building blocks of transformer architecture (sketched in code below)
• [00:42:38] 1: Input embedding
• [00:49:13] 2: Positional encoding
• [00:52:32] 3: Attention mechanism
• [01:14:44] 4: Feedforward neural network
• [01:17:43] 5: Linear transformation and softmax
• [01:27:39] Inference vs training time
• [01:47:49] Why transformers are so powerful
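If you'd like to see the five building blocks listed above in one place before listening, here is a minimal NumPy sketch of a single-block, single-head forward pass. It is an illustration under simplifying assumptions (tiny dimensions, random untrained weights, one attention head, no layer normalization or causal masking), not the implementation taught in the course.

```python
# Toy single-block, single-head "transformer" forward pass illustrating the
# five building blocks from the episode. Sizes, names, and random weights
# are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)
vocab_size, seq_len, d_model, d_ff = 50, 4, 8, 32

# 1: Input embedding -- map each token id to a d_model-dimensional vector.
embedding = rng.normal(size=(vocab_size, d_model))

# 2: Positional encoding -- the sinusoidal scheme from "Attention Is All You Need".
def positional_encoding(seq_len, d_model):
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])  # sine on even dimensions
    pe[:, 1::2] = np.cos(angles[:, 1::2])  # cosine on odd dimensions
    return pe

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# 3: Attention mechanism -- scaled dot-product self-attention (one head,
# no causal mask, for simplicity).
def self_attention(x, w_q, w_k, w_v):
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])  # similarity of every token pair
    return softmax(scores) @ v               # weighted mix of value vectors

# 4: Feedforward neural network -- two linear layers with a ReLU between.
def feed_forward(x, w1, w2):
    return np.maximum(0, x @ w1) @ w2

# Random, untrained weights (a real model learns these during training).
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
w1, w2 = rng.normal(size=(d_model, d_ff)), rng.normal(size=(d_ff, d_model))
w_out = rng.normal(size=(d_model, vocab_size))

tokens = np.array([3, 14, 15, 9])                              # made-up token ids
x = embedding[tokens] + positional_encoding(seq_len, d_model)  # blocks 1 + 2
x = x + self_attention(x, w_q, w_k, w_v)                       # block 3 (residual add)
x = x + feed_forward(x, w1, w2)                                # block 4 (residual add)

# 5: Linear transformation and softmax -- project to vocabulary logits,
# then normalize into a probability distribution per position.
probs = softmax(x @ w_out)
print(probs.shape)  # (4, 50): a next-token distribution for each position
```

With trained weights, the softmax row for the last position would be the model's next-token distribution, which connects to the inference-vs-training discussion around the 1:27 mark.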
Additional materials: https://www.superdatascience.com/747