Doubt in cost function - training models

SOUMYADEEP_BANIK_CHO · June 20, 2020, 2:39pm

Q. What is the difference between a cost function and a loss function?

satyajit_das · June 22, 2020, 2:49pm

When the two terms are distinguished, the “loss function” is the error for a single training example, whereas the “cost function” is the average of those losses over the entire training set, for example the mean squared error (MSE).
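
To make the per-example vs. whole-dataset distinction concrete, here is a minimal sketch. It assumes squared error as the per-example loss and the mean of those losses (MSE) as the cost; the data and the function names `squared_loss` and `mse_cost` are made up for illustration.

```python
# Hypothetical toy data: actual values y and model predictions y_hat.
y = [3.0, 5.0, 7.0]
y_hat = [2.5, 5.0, 8.0]


def squared_loss(actual, predicted):
    """Loss for ONE training example (squared error, chosen for illustration)."""
    return (actual - predicted) ** 2


def mse_cost(actuals, predictions):
    """Cost for the WHOLE training set: the mean of the per-example losses."""
    losses = [squared_loss(a, p) for a, p in zip(actuals, predictions)]
    return sum(losses) / len(losses)


per_example_losses = [squared_loss(a, p) for a, p in zip(y, y_hat)]
cost = mse_cost(y, y_hat)

print(per_example_losses)  # [0.25, 0.0, 1.0]  -> one loss per example
print(cost)                # 1.25 / 3, roughly 0.4167 -> one cost for the set
```

Under this convention there are as many loss values as training examples, but only one cost value for the dataset.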

sgiri · June 22, 2020, 4:51pm

Hi Soumyadeep,

Actually, the three terms loss function, objective function and cost function are usually used interchangeably.

From the “Deep Learning” book by Ian Goodfellow, Yoshua Bengio and Aaron Courville, Section 4.3:

“The function we want to minimize or maximize is called the objective function, or criterion. When we are minimizing it, we may also call it the cost function, loss function, or error function. In this book, we use these terms interchangeably, though some machine learning publications assign special meaning to some of these terms.”

As per this book, at least, loss and cost are the same.

ss7dec · June 22, 2020, 9:07pm

Yeah, this point was explained in one of the videos during the live Question-Answer Session, but I don’t recall which one at this moment. However, I believe this topic is lucidly explained in the 1st video recording of Training Models by the Faculty Trainer, Sandeep Sir.
Following is the reply to your doubt:
Note that cost function, loss function and criterion are commonly treated as one and the same, and the terms are used interchangeably.
The cost function measures how well the model performs on the given training dataset.
The goal is to minimize the cost function, i.e. y^ should be as close as possible to y (the predicted value should be close to the actual value), so that the error y - y^ (Actual Value - Predicted Value) is small.
Looking at the concepts explained prior to the Training Models topic, note that SGD (Stochastic Gradient Descent) is not itself a cost function.
SGD is an optimization algorithm used for training models. In other words, when we train a model on a dataset, SGD works behind the scenes, adjusting the model parameters a little after each training example (or small batch of examples).
SGD is used to minimize the cost function. The cost function is basically the average of the errors the model makes over the training dataset (or a sample of it).
RMSE (Root Mean Squared Error) and MAE (Mean Absolute Error) are some of the cost functions used in ML; SGD is not one of them, but a technique for minimizing such a cost.
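
As a rough sketch of how these pieces fit together: the cost functions (MSE, RMSE, MAE) only measure the error, while SGD is the procedure that changes the model parameters to reduce it. The example below assumes a toy one-feature linear model y^ = w*x + b trained on squared-error loss; the data, the learning rate, the epoch count and the helper name `sgd_fit` are illustrative choices, not anything taken from the course material.

```python
import math
import random


def mse(actuals, predictions):
    """Mean Squared Error over the whole dataset."""
    return sum((a - p) ** 2 for a, p in zip(actuals, predictions)) / len(actuals)


def rmse(actuals, predictions):
    """Root Mean Squared Error: the square root of the MSE."""
    return math.sqrt(mse(actuals, predictions))


def mae(actuals, predictions):
    """Mean Absolute Error over the whole dataset."""
    return sum(abs(a - p) for a, p in zip(actuals, predictions)) / len(actuals)


def sgd_fit(xs, ys, learning_rate=0.02, epochs=500, seed=0):
    """Fit y^ = w*x + b with plain SGD: one parameter update per example."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in random order each epoch
        for i in indices:
            error = (w * xs[i] + b) - ys[i]  # predicted - actual for ONE example
            # Gradient of the per-example squared loss error**2 w.r.t. w and b.
            w -= learning_rate * 2 * error * xs[i]
            b -= learning_rate * 2 * error
    return w, b


# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

cost_before = mse(ys, [0.0 for _ in xs])  # cost with w = 0, b = 0
w, b = sgd_fit(xs, ys)
predictions = [w * x + b for x in xs]
cost_after = mse(ys, predictions)

print("w, b:", w, b)  # should end up close to 2 and 1
print("MSE before / after:", cost_before, cost_after)
print("RMSE:", rmse(ys, predictions), "MAE:", mae(ys, predictions))
```

Note that `mse`, `rmse` and `mae` never modify `w` or `b`; only the loop inside `sgd_fit` does, which is the difference between a cost function and an optimization algorithm.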