4-Bit Training for Billion-Parameter LLMs? Yes, Really.

10,708 plays
👉 Check out Simplilearn’s SkillUp FREE courses (sponsor): https://www.simplilearn.com/skillup-free-online-courses?utm_campaign=AICoffeeBreak_Description&utm_medium=INFLCR_SkillUP&utm_source=Youtube

Video summary: 📺 We all know quantization works at inference time, but researchers successfully trained a 13-billion-parameter LLaMA 2 model using FP4 precision. Yes, just 16 values per number! In this video, we break down the paper. Check it out if you want to learn something about quantization and low/mixed-precision training in general!

AI Coffee Break Merch! 🛍️ https://aicoffeebreak.creator-spring.com/

📃 FP4 training: Ruizhe Wang, Yeyun Gong, Xiao Liu, Guoshuai Zhao, Ziyue Yang, Baining Guo, Zhengjun Zha, and Peng Cheng. "Optimizing Large Language Model Training Using FP4 Quantization." (2025) https://arxiv.org/abs/2501.17116
📃 FP8 training: Charlie Blake, Constantin Eichenberg, Josef Dean, Lukas Balles, Luke Y. Prince, Björn Deiseroth, Andres Felipe Cruz-Salinas, Carlo Luschi, Samuel Weinbach, and Douglas Orr. "u-μP: The Unit-Scaled Maximal Update Parametrization." (2024) https://arxiv.org/abs/2407.17465

Outline:
00:00 Training with FP4 quantization
02:02 Simplilearn (Sponsor)
03:25 Training LLMs in FP4 – Motivation
08:14 Step 1: Quantize the matrix multiplications
10:22 Step 2: Handle the outliers in activations
11:44 Step 3: Make quantization differentiable
13:00 Putting it all together
13:33 Results
14:14 Impact

Thanks to our Patrons who support us in Tier 2, 3, 4: 🙏 Vignesh Valliappan, Michael, Sunny Dhiana, Andy Ma

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀
🔥 Optionally, buy us a coffee to help with our Coffee Bean production! ☕
Patreon: https://www.patreon.com/AICoffeeBreak
Ko-fi: https://ko-fi.com/aicoffeebreak
Join this channel as a Bean Member to get access to perks: https://www.youtube.com/channel/UCobqgqE4i5Kf7wrxRxhToQA/join
▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

🔗 Links:
AICoffeeBreakQuiz: https://www.youtube.com/c/AICoffeeBreak/community
Twitter / X: https://twitter.com/AICoffeeBreak
LinkedIn: https://www.linkedin.com/in/letitia-parcalabescu/
Threads: https://www.threads.net/@ai.coffee.break
Bluesky: https://bsky.app/profile/aicoffeebreak.bsky.social
Reddit: https://www.reddit.com/r/AICoffeeBreak/
YouTube: https://www.youtube.com/AICoffeeBreak
Substack: https://aicoffeebreakwl.substack.com/
Web: https://explanationmark.de/letitia https://aicoffeebreak.com

#AICoffeeBreak #MsCoffeeBean #MachineLearning #AI #research

Video editing: Nils Trost
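
Bonus for the curious: to make the "just 16 values per number" claim from the video summary concrete, here is a minimal sketch of simulated ("fake") FP4 quantization in NumPy. It is not the paper's implementation. It assumes an E2M1 bit layout (1 sign, 2 exponent, 1 mantissa bit) and simple per-tensor absmax scaling, and the names FP4_GRID and fake_quantize_fp4 are made up for this illustration.

```python
import numpy as np

# Assumption: E2M1 layout. These are its non-negative magnitudes; adding the
# sign bit gives 16 bit patterns (+0 and -0 coincide, so 15 distinct values).
FP4_MAGNITUDES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_GRID = np.unique(np.concatenate([-FP4_MAGNITUDES, FP4_MAGNITUDES]))


def fake_quantize_fp4(x):
    """Snap x to the nearest FP4 grid point after absmax scaling, then rescale.

    Hypothetical helper: the result is still stored in float64, it only
    simulates the rounding error that a real 4-bit format would introduce.
    """
    x = np.asarray(x, dtype=np.float64)
    absmax = np.max(np.abs(x))
    if absmax == 0.0:
        return x.copy()
    scale = FP4_GRID.max() / absmax  # map the largest magnitude onto 6.0
    scaled = x * scale
    # Nearest-neighbour lookup over the distinct grid values.
    idx = np.argmin(np.abs(scaled[..., None] - FP4_GRID), axis=-1)
    return FP4_GRID[idx] / scale


if __name__ == "__main__":
    activations = np.array([0.1, -0.7, 3.0, 0.02])
    print(fake_quantize_fp4(activations))
```

With the input above, the single large entry (3.0) sets the scale, so the two small entries (0.1 and 0.02) are rounded to zero. That is the kind of damage outliers cause, and it is why Step 2 in the outline deals with outliers in activations.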