Reza Shabani - How Replit Trained Their Own LLMs (LLM Bootcamp)

11,246 plays
New course announcement ✨ We're teaching an in-person LLM bootcamp in the SF Bay Area on November 14, 2023. Come join us if you want to see the most up-to-date materials on building LLM-powered products and learn in a hands-on environment.
https://www.scale.bythebay.io/llm-workshop
Hope to see some of you there!

---------------------------------------------------------------------------------------------

In this video, Reza Shabani of Replit walks through the process of training your own LLM, from data processing to deployment.

Download the slides and read the talk summary here: https://fullstackdeeplearning.com/llm-bootcamp/spring-2023/shabani-train-your-own
Watch the rest of the LLM Bootcamp videos here: https://www.youtube.com/playlist?list=PL1T8fO7ArWleyIqOy37OVXsP4hFXymdOZ
Outro music made with Riffusion: https://github.com/riffusion/riffusion

00:00 Why train your own LLMs?
04:44 The Modern LLM Stack
07:24 Data Pipelines: Databricks & Hugging Face
13:34 Preprocessing
16:29 Tokenizer Training
19:57 Running Training: MosaicML, Weights & Biases
22:41 Testing & Evaluation: HumanEval, Hugging Face
26:33 Deployment: FasterTransformer, Triton Server, k8s
27:55 Lessons learned: data-centrism, eval, and collaboration
30:12 What makes a good LLM engineer?