New course announcement ✨
We're teaching an in-person LLM bootcamp in the SF Bay Area on November 14, 2023. Join us if you want to see the most up-to-date materials on building LLM-powered products and learn in a hands-on environment.
https://www.scale.bythebay.io/llm-workshop
Hope to see some of you there!
---------------------------------------------------------------------------------------------
In this video, Reza Shabani of Replit walks through the process of training your own LLM, from data processing to deployment.
Download the slides and read the talk summary here: https://fullstackdeeplearning.com/llm-bootcamp/spring-2023/shabani-train-your-own
Watch the rest of the LLM Bootcamp videos here: https://www.youtube.com/playlist?list=PL1T8fO7ArWleyIqOy37OVXsP4hFXymdOZ
Outro music made with Riffusion: https://github.com/riffusion/riffusion
00:00 Why train your own LLMs?
04:44 The Modern LLM Stack
07:24 Data Pipelines: Databricks & Hugging Face
13:34 Preprocessing
16:29 Tokenizer Training
19:57 Running Training: MosaicML, Weights & Biases
22:41 Testing & Evaluation: HumanEval, Hugging Face
26:33 Deployment: FasterTransformer, Triton Server, k8s
27:55 Lessons learned: data-centrism, eval, and collaboration
30:12 What makes a good LLM engineer?