Build an LLM from Scratch 5: Pretraining on Unlabeled Data

Links to the book:
- Amazon: https://amzn.to/4fqvn0D
- Manning: https://mng.bz/M96o

Link to the GitHub repository: https://github.com/rasbt/LLMs-from-scratch

This video explains how to pretrain an LLM from scratch.

00:00 5.1.1 Using GPT to generate text
17:23 5.1.2 Calculating the text generation loss: cross-entropy and perplexity
44:43 5.1.3 Calculating the training and validation set losses
1:10:27 5.2 Training an LLM
1:36:36 5.3 Decoding strategies to control randomness
1:39:51 5.3.1 Temperature scaling
1:53:24 5.3.2 Top-k sampling
2:01:40 5.3.3 Modifying the text generation function
2:12:24 5.4 Loading and saving model weights in PyTorch
2:16:44 5.5 Loading pretrained weights from OpenAI

You can find additional bonus materials on GitHub:
- Pretraining GPT on the Project Gutenberg Dataset: https://github.com/rasbt/LLMs-from-scratch/tree/main/ch05/03_bonus_pretraining_on_gutenberg
- PyTorch Performance Tips for Faster LLM Training: https://github.com/rasbt/LLMs-from-scratch/tree/main/ch05/10_llm-training-speed
- Converting the GPT-2 architecture into Llama 2 and Llama 3: https://github.com/rasbt/LLMs-from-scratch/tree/main/ch05/07_gpt_to_llama
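
To give a flavor of the loss calculation covered in 5.1.2, here is a minimal PyTorch sketch of computing the cross-entropy loss (and, from it, perplexity) for next-token prediction. The function name, tensor shapes, and the `model` object are illustrative assumptions, not the book's exact code; see the chapter and repository for the full version.

```python
import torch

def calc_loss_batch(input_batch, target_batch, model, device):
    # input_batch, target_batch: (batch_size, seq_len) token-ID tensors,
    # where target_batch is input_batch shifted one position to the left
    input_batch = input_batch.to(device)
    target_batch = target_batch.to(device)
    logits = model(input_batch)              # (batch, seq_len, vocab_size)
    loss = torch.nn.functional.cross_entropy(
        logits.flatten(0, 1),                # (batch*seq_len, vocab_size)
        target_batch.flatten()               # (batch*seq_len,)
    )
    return loss

# Perplexity is the exponential of the cross-entropy loss:
# perplexity = torch.exp(loss)
```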
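Sections 5.3.1 and 5.3.2 cover temperature scaling and top-k sampling as decoding strategies. The sketch below illustrates the general idea under the assumption that `logits` holds the model's output for the last token position, shape (vocab_size,); names and defaults are illustrative, not the video's exact implementation.

```python
import torch

def sample_next_token(logits, temperature=1.0, top_k=None):
    if top_k is not None:
        # Keep only the top_k largest logits; mask out the rest with -inf
        top_logits, _ = torch.topk(logits, top_k)
        min_val = top_logits[-1]
        logits = torch.where(
            logits < min_val,
            torch.tensor(float("-inf")).to(logits.device),
            logits
        )
    if temperature > 0.0:
        # Higher temperature flattens the distribution (more random),
        # lower temperature sharpens it (closer to greedy decoding)
        probs = torch.softmax(logits / temperature, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)
    else:
        # Temperature 0 falls back to greedy (argmax) decoding
        next_token = torch.argmax(logits, dim=-1, keepdim=True)
    return next_token
```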