Fixing bugs in Gemma, Llama, & Phi 3: Daniel Han

Fixing bugs in Gemma, Llama, & Phi 3: Daniel Han

2.893 Lượt nghe
Fixing bugs in Gemma, Llama, & Phi 3: Daniel Han
The story behind our 8 bug fixes for Gemma, multiple tokenization fixes for Llama 3, a sliding window bug fix and Mistral-fying Phi-3, and learn about how we analyse and find and fix bugs in open source models. Learn also how we make finetuning 2x faster for all these models Recorded live in San Francisco at the AI Engineer World's Fair. See the full schedule of talks at https://www.ai.engineer/worldsfair/2024/schedule & join us at the AI Engineer World's Fair in 2025! Get your tickets today at https://ai.engineer/2025 About Daniel Hey I'm Daniel, the algos guy behind Unsloth. I love making LLM training go fast! We're the guys who fixed 8 of Google's Gemma bugs, a 2048 SWA Phi-3 issue, found tokenization issues and fixed untrained tokens with Llama-3, and I run Unsloth with my brother Michael! Our open source package makes finetuning of LLMs 2x faster and uses 70% less VRAM with no accuracy degradation. I used to work at NVIDIA making GPU algos go fast and helped NASA engineers process data from a Mars rover faster!