Dylan Patel is the founder and CEO of SemiAnalysis. He joins Big Technology Podcast to explain how generative AI works, walking through the inner workings of tokens, pre-training, fine-tuning, open source, and reasoning. We also cover DeepSeek’s efficiency breakthrough, the race to build colossal AI data centers, and what GPT-5’s hybrid training approach could unlock. Hit play for a masterclass you’ll want to send to every friend puzzled (or excited) about the future of AI.
Chapters:
00:00 - Introduction to Generative AI with Dylan Patel
02:00 - Basics of AI Model Training
04:30 - Understanding Tokens and Word Representation
07:00 - How Models Process Language Patterns
10:00 - Attention Mechanisms and Context Understanding
13:00 - Pre-Training: Learning from Internet Data
16:00 - Loss Minimization and Learning Processes
19:00 - Why GPUs Are Perfect for AI Computation
22:00 - Post-Training and Model Personalities
25:00 - Reasoning: How Modern AI Models Think
28:00 - The Growing Efficiency of AI Models
31:00 - Data Center Build-Outs Despite Increasing Efficiency
34:00 - The Future of GPT-5 and AI Development