Linear Regression
Main ideas:
- Use least squares to fit a line to the data
- Use $R^2$ to quantify how well the line fits
- Use a p-value to check whether that fit is statistically significant
Fitting the Line → try to minimize a metric that measures how far the line is from the data
- Let the line be $\hat{y} = w_0 + w_1 x$
- Now, our optimization goal is to find the values of $w_0$ and $w_1$ so that the variation around this line is minimal → we do this by minimizing the squared errors, $e = \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$ (a NumPy sketch follows this list)
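Here is a minimal NumPy sketch of that minimization; the x and y values are made up for illustration, and the closed-form slope/intercept formulas are the standard least-squares estimates:

```python
import numpy as np

# Toy data, purely illustrative
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

# Standard closed-form least-squares estimates for y_hat = w0 + w1 * x
w1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
w0 = y.mean() - w1 * x.mean()

y_hat = w0 + w1 * x
sse = np.sum((y - y_hat) ** 2)  # the squared error being minimized
print(f"w0 = {w0:.3f}, w1 = {w1:.3f}, SSE = {sse:.3f}")
```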
To know whether taking the x values into account actually improves anything, all we have to do is calculate the variance around the fit, compare it with the variance around the mean of the y values, and give the answer as a percentage! This is called the $R^2$ value:
$$R^2 = \frac{\mathrm{Var}(\text{mean}) - \mathrm{Var}(\text{fit})}{\mathrm{Var}(\text{mean})}$$
Thus, if this value is 0.6, taking the x features into account gives us a 60% reduction in the variance around the mean.
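Continuing the sketch above, $R^2$ falls straight out of the two sums of squares:

```python
# Variation around the mean vs. variation around the fitted line
var_mean = np.sum((y - y.mean()) ** 2)
var_fit = np.sum((y - y_hat) ** 2)

r_squared = (var_mean - var_fit) / var_mean
print(f"R^2 = {r_squared:.3f}")  # e.g. 0.6 means a 60% reduction in variance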
Let’s get to the interesting stuff → the math behind all of this
Math of Regression
Let’s take the case of a set of N observations with multidimensional features, $\mathbf{x}_1, \dots, \mathbf{x}_N$, where each $\mathbf{x}_n \in \mathbb{R}^D$.
Here, the actual data is some function of the features plus noise: $y_n = f(\mathbf{x}_n) + \epsilon_n$
Here, I have used bold to represent vector notation. Since our model is linear, we can define it as: $\hat{y}(\mathbf{x}) = \mathbf{w}^T \mathbf{x}$
- Note: to make this work by taking the bias into account, we let $\mathbf{x} = [1, x_1, \dots, x_D]^T$ and $\mathbf{w} = [w_0, w_1, \dots, w_D]^T$, where the D weights $w_1, \dots, w_D$ correspond to the D features and the extra weight $w_0$ is the bias. Thus, the design matrix is $\mathbf{X} \in \mathbb{R}^{N \times (D+1)}$, which basically means that our N observations are stacked vertically, each observation being D-dimensional, but to make the notation work we add a 1 at the start of each row, which will be the multiplier for our bias term, and thus each row has D+1 as its dimension (see the sketch below).
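As a sketch (with hypothetical numbers), building that $N \times (D+1)$ design matrix in NumPy:

```python
import numpy as np

# Hypothetical data: N = 4 observations, each with D = 2 features
X_raw = np.array([[2.0, 1.0],
                  [3.0, 4.0],
                  [5.0, 2.0],
                  [6.0, 5.0]])
N, D = X_raw.shape

# Prepend a column of ones so the first weight acts as the bias term;
# the resulting design matrix X is N x (D + 1)
X = np.column_stack([np.ones(N), X_raw])
print(X.shape)  # (4, 3)
```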
Thus, our error now becomes $e(\mathbf{w}) = (\mathbf{y} - \mathbf{X}\mathbf{w})^T (\mathbf{y} - \mathbf{X}\mathbf{w}) = \lVert \mathbf{y} - \mathbf{X}\mathbf{w} \rVert^2$
Now, to get our optimal weights, we follow the usual method to find the minimum of e, i.e. differentiate e w.r.t. $\mathbf{w}$ and set the gradient to zero:
$$\nabla_{\mathbf{w}} e = -2\mathbf{X}^T(\mathbf{y} - \mathbf{X}\mathbf{w}) = 0 \;\Rightarrow\; \mathbf{w}^* = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}$$
These are the normal equations.
Hence, all we need to do is plug in $\mathbf{X}$ and $\mathbf{y}$ to get the optimal weights $\mathbf{w}^*$.
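A sketch of that plug-in step, reusing the hypothetical X built above (the targets y are also made up); solving the linear system is numerically preferable to forming the inverse explicitly:

```python
y = np.array([3.1, 7.9, 6.2, 11.0])  # hypothetical targets

# Normal equations: w* = (X^T X)^{-1} X^T y, solved as a linear system
w_star = np.linalg.solve(X.T @ X, X.T @ y)

# Equivalent, and robust even when X is rank-deficient:
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

y_hat = X @ w_star  # predictions from the fitted weights
```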
The essence of regression remains the same: in the case where $D = 1$, this formula reduces to exactly the simple line fit we started with.