Curious how a 1.5B-parameter model can solve maths problems better than far larger models? In this video, I demonstrate how DeepSeek R1 leverages lengthy chains of thought to enhance its mathematical reasoning. We take a close look at how DeepSeek R1 prompts are structured and generated according to the R1 paper, then reproduce these chain-of-thought prompts using the DeepSeek R1 cold-start method and my own maths compiler to create synthetic training data.
I then walk through the entire fine-tuning process, step by step, showing how even a relatively modest model can outperform bulkier rivals using DeepSeek R1's cold-start technique. If you're fascinated by AI breakthroughs or simply enjoy seeing a thorough training pipeline, this detailed behind-the-scenes session is for you.
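As a rough illustration (not the actual chuk-math code, and the exact record layout in the video may differ), a single cold-start record might pair a prompt with a completion whose worked reasoning is wrapped in the <think> tags discussed in the video and whose result sits in an <answer> tag:

```python
# Hypothetical sketch (not the actual chuk-math code): build one cold-start
# training record where the worked reasoning sits inside <think> tags and the
# final result inside <answer> tags, then emit it as a JSON line.
import json


def make_coldstart_example(problem: str, steps: list[str], answer: str) -> dict:
    """Wrap a worked maths problem as a prompt/completion pair."""
    chain_of_thought = "\n".join(steps)
    return {
        "prompt": f"Solve the following problem step by step: {problem}",
        "completion": f"<think>\n{chain_of_thought}\n</think>\n<answer>{answer}</answer>",
    }


if __name__ == "__main__":
    record = make_coldstart_example(
        problem="(3 + 4) * 2",
        steps=[
            "First evaluate the parentheses: 3 + 4 = 7.",
            "Then multiply by 2: 7 * 2 = 14.",
            "Check the result: 14 / 2 = 7 and 7 - 4 = 3, so the steps are consistent.",
        ],
        answer="14",
    )
    # One JSON object per line, ready to append to a cold-start .jsonl file.
    print(json.dumps(record))
```

Running this prints a single JSON line; generating many such records and appending them to a .jsonl file gives a cold-start dataset suitable for supervised fine-tuning.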
GitHub repo for the maths compiler: https://github.com/chrishayuk/chuk-math
GitHub repo for the verifiers: https://github.com/chrishayuk/verifiers
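The fine-tuning step can then be sketched along the following lines, assuming the cold-start records above have been written to a coldstart.jsonl file with prompt/completion fields; this uses Hugging Face TRL's SFTTrainer as a stand-in and is not necessarily the exact tooling shown in the video.

```python
# Hedged sketch: supervised fine-tuning of Qwen2.5-1.5B on the cold-start
# JSONL using Hugging Face TRL. File name and hyperparameters are assumptions.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# prompt/completion records produced by the data-generation step
train_dataset = load_dataset("json", data_files="coldstart.jsonl", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-1.5B-Instruct",  # or the base "Qwen/Qwen2.5-1.5B"
    train_dataset=train_dataset,
    args=SFTConfig(output_dir="qwen2.5-1.5b-coldstart"),
)
trainer.train()
```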
00:00 - Intro
01:10 - DeepSeek R1 Chat
03:35 - DeepSeek R1 Ollama
04:44 - Think Tags
05:04 - DeepSeek R1 paper
13:45 - Generating synthetic long chains of thought
15:25 - Translating the CoT to natural language
18:40 - Self-Reflection and Self-Correction
22:50 - Generating sample data
30:06 - Testing the base Qwen2.5-1.5B model
30:52 - Fine-Tuning Qwen2.5-1.5B with our Cold-Start data
34:52 - Chatting with our Fine-Tuned Model
39:55 - Conclusion