Stanford CS229 I Machine Learning I Building Large Language Models (LLMs)

For more information about Stanford's Artificial Intelligence programs visit: https://stanford.io/ai

This lecture provides a concise overview of building a ChatGPT-like model, covering both pretraining (language modeling) and post-training (SFT/RLHF). For each component, it explores common practices in data collection, algorithms, and evaluation methods. This guest lecture was delivered by Yann Dubois in Stanford's CS229: Machine Learning course, in Summer 2024.

About the speaker:
Yann Dubois, PhD Student at Stanford
https://yanndubs.github.io/
Yann Dubois is a fourth-year CS PhD student advised by Percy Liang and Tatsu Hashimoto. His research focuses on improving the effectiveness of AI when resources are scarce. Most recently, he has been part of the Alpaca team, working on training and evaluating language models more efficiently using other LLMs.

To view all online courses and programs offered by Stanford, visit: http://online.stanford.edu

Chapters:
00:00 - Introduction
00:10 - Recap on LLMs
00:16 - Definition of LLMs
00:19 - Examples of LLMs
01:16 - Importance of Data
01:20 - Evaluation Metrics
01:33 - Systems Component
01:41 - Importance of Systems
01:47 - LLMs Based on Transformers
01:57 - Focus on Key Topics
02:00 - Transition to Pretraining
03:02 - Overview of Language Modeling
04:17 - Generative Models Explained
05:15 - Autoregressive Models Definition
06:36 - Autoregressive Task Explanation
07:49 - Training Overview
08:48 - Tokenization Importance
10:50 - Tokenization Process
13:30 - Example of Tokenization
16:00 - Evaluation with Perplexity
20:50 - Current Evaluation Methods
24:30 - Academic Benchmark: MMLU