The P in GPT - a down-to-earth explainer of gradient descent
You've graciously put up with my endless ramblings about parameters and mixers.
And I know what you're thinking. "Enough is enough! No more analogies! I want to actually know how it works. If the secret to GPT's power lies in training, then what exactly is training?? Reveal the secret sauce!"
Give me 20 minutes of your time, and I will do just that!
Prepare for hardcore terminology that sounds so SciFi that it belongs in Star Trek:
- Cross-entropy loss
- Successive applications of the chain rule
- Stochastic gradient descent
And yet, we'll cover it in a completely down-to-earth way that doesn't bring back nightmares from high school calculus.
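To whet your appetite, here's the tiniest possible preview in Python. This is my own toy, not anything from GPT's actual training code: a made-up loss function with a single parameter, nudged downhill over and over. That's the heart of gradient descent, and everything we cover later is this idea scaled up to billions of parameters.

```python
# A toy preview of gradient descent - NOT GPT's training code, just the core idea in miniature.
# We want to find the value of w that makes the loss as small as possible.

def loss(w):
    # A made-up loss: how far w is from the "right answer" of 3, squared
    return (w - 3) ** 2

def gradient(w):
    # The slope of the loss with respect to w (calculus says it's 2 * (w - 3))
    return 2 * (w - 3)

w = 0.0                # start with a guess
learning_rate = 0.1    # how big a step to take each time

for step in range(25):
    # Step downhill: move w in the opposite direction of the slope
    w = w - learning_rate * gradient(w)

print(f"After training, w = {w:.4f}, loss = {loss(w):.6f}")
# w ends up very close to 3 - the value that minimizes the loss
```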
Alas - prepare for some disappointing news at the end: there will still be more pieces to the puzzle, even after this. But we will be most of the way there!
You may even start to taste the ketchup and mayo in the secret sauce.
___
If you'd like to learn how to build with Large Language Models, including fine-tuning your own and coding Agent solutions to solve commercial problems, please take a look at my intensive 8-week course on LLM Engineering:
https://www.udemy.com/course/llm-engineering-master-ai-and-large-language-models/?referralCode=35EB41EBB11DD247CF54
Connect with me on LinkedIn:
https://www.linkedin.com/in/eddonner/
Follow me on X:
https://x.com/edwarddonner