Head to https://squarespace.com/artem to save 10% off your first purchase of a website or domain using code ARTEMKIRSANOV
Socials:
X/Twitter: https://x.com/ArtemKRSV
Patreon: https://patreon.com/artemkirsanov
My name is Artem. I'm a graduate student at the NYU Center for Neural Science and a researcher at the Flatiron Institute.
In this video we dive deep into the probabilistic interpretation behind the core linear regression algorithm, building it from the ground up. We talk about how the least squares objective naturally arises when we try to maximize the probability of the observed data under the model, and how the square comes from assuming a Gaussian distribution of the noise in the samples. We also explore how incorporating prior beliefs about the distribution of model parameters leads to different kinds of regularization in the objective function.
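For reference, here is a minimal sketch of the derivation discussed above (the notation is my own and not taken from the video). Assuming observations y_i = w^T x_i + eps_i with Gaussian noise eps_i ~ N(0, sigma^2), the log-likelihood of the data is

\log p(\mathbf{y} \mid X, \mathbf{w}) = \sum_i \log \mathcal{N}(y_i \mid \mathbf{w}^\top \mathbf{x}_i, \sigma^2) = -\frac{1}{2\sigma^2} \sum_i (y_i - \mathbf{w}^\top \mathbf{x}_i)^2 + \text{const},

so maximizing the likelihood is equivalent to minimizing the sum of squared errors. Incorporating a prior p(w) and maximizing the posterior instead (MAP estimation) gives

\hat{\mathbf{w}} = \arg\max_{\mathbf{w}} \big[ \log p(\mathbf{y} \mid X, \mathbf{w}) + \log p(\mathbf{w}) \big],

where a Gaussian prior \mathbf{w} \sim \mathcal{N}(0, \tau^2 I) contributes a term proportional to -\|\mathbf{w}\|_2^2 (L2 / ridge regularization), and a Laplace prior contributes a term proportional to -\|\mathbf{w}\|_1 (L1 / lasso regularization).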
Outline:
00:00 Introduction
01:16 What is Regression
02:11 Fitting noise in a linear model
06:02 Deriving Least Squares
07:46 Sponsor: Squarespace
09:04 Incorporating Priors
12:06 L2 regularization as Gaussian Prior
14:30 L1 regularization as Laplace Prior
16:16 Putting it all together