At last, part 4 in our series of videos on Gradient Boost. This time we dive deep into the details of how it is used for classification, going through the algorithm and the math behind it, one step at a time. Specifically, we derive the loss function from the log(likelihood) of the data, and we derive the functions used to calculate the output values from the leaves in each tree. This one is long, but well worth it if you want to know how Gradient Boost works.
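The steps described above can be sketched in Python. Everything here (the toy data, the tree depth, the learning rate) is invented for illustration; the key pieces are the initial log(odds) prediction, the pseudo-residuals, and the leaf output value sum(residuals) / sum(p * (1 - p)) that the video derives:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy data, made up for illustration: one feature, binary labels
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

learning_rate = 0.1  # hypothetical choice
n_trees = 50

# Initial prediction: the log(odds) of the positive class
p = y.mean()
log_odds = np.full(len(y), np.log(p / (1 - p)))

for _ in range(n_trees):
    prob = 1 / (1 + np.exp(-log_odds))  # convert log(odds) to probability
    residuals = y - prob                # pseudo-residuals (observed - predicted)
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
    # Each leaf's output value is sum(residuals) / sum(p * (1 - p)),
    # not the leaf's mean residual, so compute it per leaf by hand
    leaf_ids = tree.apply(X)
    gamma = {}
    for leaf in np.unique(leaf_ids):
        in_leaf = leaf_ids == leaf
        gamma[leaf] = residuals[in_leaf].sum() / (prob[in_leaf] * (1 - prob[in_leaf])).sum()
    log_odds += learning_rate * np.array([gamma[leaf] for leaf in leaf_ids])

final_prob = 1 / (1 + np.exp(-log_odds))
print(np.round(final_prob, 2))
```

After 50 rounds the predicted probabilities have moved close to 0 for the first three samples and close to 1 for the last three.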
NOTE: There is a minor error at
7:01. It should just say log(p) - log(1-p) = log(p/(1-p)). And at
19:10 I forgot to put "L" in front of some of the loss functions. However, it should be clear what they are, since I point to them and say, "This is the loss function".
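As a quick sanity check on the corrected identity, here is a numeric verification (the value 0.7 is an arbitrary probability chosen for illustration):

```python
import numpy as np

p = 0.7  # any probability strictly between 0 and 1

# log(p) - log(1-p) equals log(p/(1-p)), which is the log(odds)...
log_odds = np.log(p) - np.log(1 - p)
assert np.isclose(log_odds, np.log(p / (1 - p)))

# ...but it is NOT equal to log(p)/log(1-p)
assert not np.isclose(log_odds, np.log(p) / np.log(1 - p))

print(log_odds)
```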
This StatQuest assumes that you have already watched Parts 1, 2 and 3 in this series:
Part 1, Regression Main Ideas:
https://youtu.be/3CC4N4z3GJc
Part 2, Regression Details:
https://youtu.be/2xudPOBz-vs
Part 3, Classification Main Ideas:
https://youtu.be/jxuNLH5dXCs
...and it also assumes that you understand odds, the log(odds), and Logistic Regression pretty well. Here are the links for...
The odds:
https://youtu.be/ARfXDSkQf1Y
A general overview of Logistic Regression:
https://youtu.be/yIYKR4sgzI8
How to interpret the coefficients:
https://youtu.be/vN5cNN2-HWE
and how to estimate the coefficients:
https://youtu.be/BfKanl1aSG0
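To tie those prerequisites together, odds, log(odds), and the logistic function are related as sketched below (the win/loss counts are invented for illustration):

```python
import numpy as np

# Hypothetical record: 5 wins and 3 losses
wins, losses = 5, 3
odds = wins / losses             # odds of winning = 5/3
log_odds = np.log(odds)          # the log(odds)
p = wins / (wins + losses)       # probability of winning = 5/8

# odds and probability are two views of the same thing: odds = p / (1 - p)
assert np.isclose(odds, p / (1 - p))

# the logistic (sigmoid) function converts a log(odds) back into a probability
assert np.isclose(1 / (1 + np.exp(-log_odds)), p)

print(odds, log_odds, p)
```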
Lastly, if you want to learn more about using different probability thresholds for classification, check out the StatQuest on ROC and AUC:
https://youtu.be/xugjARegisk
This StatQuest is based on the following sources:
A 1999 manuscript by Jerome Friedman that introduced Stochastic Gradient Boosting: https://jerryfriedman.su.domains/ftp/stobst.pdf
The Wikipedia article on Gradient Boosting: https://en.wikipedia.org/wiki/Gradient_boosting
The scikit-learn implementation of Gradient Boosting: https://scikit-learn.org/stable/modules/ensemble.html#gradient-boosting
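If you want to use gradient boosting rather than build it by hand, scikit-learn's GradientBoostingClassifier implements the same ideas. This is only a sketch: the synthetic dataset and the parameter values are chosen purely for illustration, not as recommendations:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Synthetic data, invented for illustration
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# n_estimators = number of trees; learning_rate scales each tree's contribution
clf = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3)
clf.fit(X_train, y_train)

print(clf.score(X_test, y_test))  # accuracy on held-out data
```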
For a complete index of all the StatQuest videos, check out:
https://statquest.org/video-index/
If you'd like to support StatQuest, please consider...
Patreon: https://www.patreon.com/statquest
...or...
YouTube Membership: https://www.youtube.com/channel/UCtYLUTtgS3k1Fg4y5tAhLbw/join
...buying one of my books, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
https://statquest.org/statquest-store/
...or just donating to StatQuest!
https://www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on Twitter:
https://twitter.com/joshuastarmer
Corrections:
6:58 log(p) - log(1-p) is not equal to log(p)/log(1-p) but equal to log(p/(1-p)). In other words, the result at
7:07, log(p) - log(1-p) = log(odds), is correct, and thus the error does not propagate beyond its short but embarrassing moment.
26:53, my indexing of the variables gets off track. This is unfortunate, but you should still be able to follow the concepts.
#statquest #gradientboost