In order to bring down the overfitting or variance, we use the regularization term in our cost function. There are three types of regularizations:
- Ridge
- Lasso
- ElasticNet
The main objective here is to bring the freedom down by putting constraints on the weights. Basically, we force the model to keep the weights as small as possible. This is done by modifying the cost function such that the cost is higher when the weights are higher.
We understand that in the Ridge Regularization, we add summations of squares of all weights while in Lasso, we add the total of absolute values ( e.g. -3.5 => 3.5 and 4.1 => 4.1) of weights to the cost function. The Elastic Net is a combination of both Ridge and Lasso - it has the sum of squares as well as the absolute values of the weights in loss or cost function.
What exactly is the difference in the outcomes of Ridge and Lasso? And Why?