Links to the book:
- https://amzn.to/4fqvn0D (Amazon)
- https://mng.bz/M96o (Manning)
Link to the GitHub repository: https://github.com/rasbt/LLMs-from-scratch
This video explains how to pretrain an LLM from scratch; a few minimal code sketches of the main ideas follow the chapter list below.
00:00 5.1.1 Using GPT to generate text
17:23 5.1.2 Calculating the text generation loss: cross-entropy and perplexity
44:43 5.1.3 Calculating the training and validation set losses
1:10:27 5.2 Training an LLM
1:36:36 5.3 Decoding strategies to control randomness
1:39:51 5.3.1 Temperature scaling
1:53:24 5.3.2 Top-k sampling
2:01:40 5.3.3 Modifying the text generation function
2:12:24 5.4 Loading and saving model weights in PyTorch
2:16:44 5.5 Loading pretrained weights from OpenAI
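As a quick illustration of what 5.1.1 covers, here is a minimal greedy-decoding sketch (not the book's generate_text_simple code; the tiny Sequential model only stands in for the GPTModel from the video):

import torch

vocab_size, emb_dim = 1000, 16
# Stand-in "model"; the video uses the book's GPTModel instead
model = torch.nn.Sequential(
    torch.nn.Embedding(vocab_size, emb_dim),
    torch.nn.Linear(emb_dim, vocab_size),
)

idx = torch.tensor([[1, 2, 3]])  # (batch, current sequence of token IDs)
for _ in range(5):
    logits = model(idx)                                   # (batch, seq_len, vocab_size)
    next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # most likely next token
    idx = torch.cat([idx, next_id], dim=1)                # append and feed back in
print(idx)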
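The loss computation in 5.1.2 boils down to PyTorch's cross-entropy plus an exponential for perplexity; a toy sketch with made-up logits for 3 token positions over a 5-token vocabulary:

import torch

logits = torch.randn(3, 5)             # (num_tokens, vocab_size), made-up values
targets = torch.tensor([1, 4, 0])      # correct next-token IDs
loss = torch.nn.functional.cross_entropy(logits, targets)
perplexity = torch.exp(loss)           # perplexity = exp(cross-entropy)
print(loss.item(), perplexity.item())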
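Sections 5.3.1 and 5.3.2 cover temperature scaling and top-k sampling; a minimal sketch of both applied to a made-up vector of next-token logits:

import torch

torch.manual_seed(123)
logits = torch.tensor([4.0, 1.0, -1.0, 3.0, 0.5])

temperature = 1.5                      # >1 flattens, <1 sharpens the distribution
k = 3
top_vals, top_idx = torch.topk(logits, k)
masked = torch.full_like(logits, float("-inf"))
masked[top_idx] = logits[top_idx]      # keep only the k largest logits

probs = torch.softmax(masked / temperature, dim=-1)
next_token = torch.multinomial(probs, num_samples=1)
print(probs, next_token)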
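And for 5.4, the standard PyTorch checkpointing pattern in a nutshell (the Linear layer is again just a stand-in for the book's GPTModel):

import torch

model = torch.nn.Linear(8, 8)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)

torch.save({
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
}, "model_and_optimizer.pth")

checkpoint = torch.load("model_and_optimizer.pth")
model.load_state_dict(checkpoint["model_state_dict"])
optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
model.eval()                           # or model.train() to resume training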
You can find additional bonus materials on GitHub:
- Pretraining GPT on the Project Gutenberg Dataset: https://github.com/rasbt/LLMs-from-scratch/tree/main/ch05/03_bonus_pretraining_on_gutenberg
- PyTorch Performance Tips for Faster LLM Training: https://github.com/rasbt/LLMs-from-scratch/tree/main/ch05/10_llm-training-speed
- Converting the GPT-2 architecture into Llama 2 and Llama 3: https://github.com/rasbt/LLMs-from-scratch/tree/main/ch05/07_gpt_to_llama