Build an LLM from Scratch 5: Pretraining on Unlabeled Data

Links to the book:
- Amazon: https://amzn.to/4fqvn0D
- Manning: https://mng.bz/M96o

Link to the GitHub repository: https://github.com/rasbt/LLMs-from-scratch

This video explains how to pretrain an LLM from scratch.

00:00 5.1.1 Using GPT to generate text
17:23 5.1.2 Calculating the text generation loss: cross-entropy and perplexity
44:43 5.1.3 Calculating the training and validation set losses
1:10:27 5.2 Training an LLM
1:36:36 5.3 Decoding strategies to control randomness
1:39:51 5.3.1 Temperature scaling
1:53:24 5.3.2 Top-k sampling
2:01:40 5.3.3 Modifying the text generation function
2:12:24 5.4 Loading and saving model weights in PyTorch
2:16:44 5.5 Loading pretrained weights from OpenAI

You can find additional bonus materials on GitHub:
- Pretraining GPT on the Project Gutenberg Dataset: https://github.com/rasbt/LLMs-from-scratch/tree/main/ch05/03_bonus_pretraining_on_gutenberg
- PyTorch Performance Tips for Faster LLM Training: https://github.com/rasbt/LLMs-from-scratch/tree/main/ch05/10_llm-training-speed
- Converting the GPT-2 architecture into Llama 2 and Llama 3: https://github.com/rasbt/LLMs-from-scratch/tree/main/ch05/07_gpt_to_llama
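
To give a flavor of the loss calculation covered in 5.1.2, here is a minimal PyTorch sketch of computing the cross-entropy loss (and, from it, perplexity) for next-token prediction. The function name, tensor shapes, and the `model` object are illustrative assumptions, not the book's exact code; see the chapter and repository for the full version.

```python
import torch

def calc_loss_batch(input_batch, target_batch, model, device):
    # input_batch, target_batch: (batch_size, seq_len) token-ID tensors,
    # where target_batch is input_batch shifted one position to the left
    input_batch = input_batch.to(device)
    target_batch = target_batch.to(device)
    logits = model(input_batch)              # (batch, seq_len, vocab_size)
    loss = torch.nn.functional.cross_entropy(
        logits.flatten(0, 1),                # (batch*seq_len, vocab_size)
        target_batch.flatten()               # (batch*seq_len,)
    )
    return loss

# Perplexity is the exponential of the cross-entropy loss:
# perplexity = torch.exp(loss)
```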
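Sections 5.3.1 and 5.3.2 cover temperature scaling and top-k sampling as decoding strategies. The sketch below illustrates the general idea under the assumption that `logits` holds the model's output for the last token position, shape (vocab_size,); names and defaults are illustrative, not the video's exact implementation.

```python
import torch

def sample_next_token(logits, temperature=1.0, top_k=None):
    if top_k is not None:
        # Keep only the top_k largest logits; mask out the rest with -inf
        top_logits, _ = torch.topk(logits, top_k)
        min_val = top_logits[-1]
        logits = torch.where(
            logits < min_val,
            torch.tensor(float("-inf")).to(logits.device),
            logits
        )
    if temperature > 0.0:
        # Higher temperature flattens the distribution (more random),
        # lower temperature sharpens it (closer to greedy decoding)
        probs = torch.softmax(logits / temperature, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)
    else:
        # Temperature 0 falls back to greedy (argmax) decoding
        next_token = torch.argmax(logits, dim=-1, keepdim=True)
    return next_token
```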