Batch Gradient Descent

Could you please explain the difference between these two statements?

Batch Gradient Descent does not scale well with the number of training instances.

Batch Gradient Descent scales well with the number of features.

Hi Anubhav_Gupta,

Could you please tell me the lecture number along with the timestamp for the part you found confusing?

Thanks!

Machine Learning, Session 13, the seventh slide on Batch Gradient Descent.

Hi Anubhav_Gupta,

Batch Gradient Descent, as the name suggests, uses the whole batch of training data at every step. Using the full batch gives a smooth, steady path toward the minimum of the cost function, but it also means every single step has to process all of the training examples. On a very large training set (say, millions of examples), each step therefore takes a long time. Hence, it does not scale well with the number of training instances.
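To make this concrete, here is a minimal NumPy sketch of Batch Gradient Descent for Linear Regression. The data sizes, learning rate, and variable names below are my own illustrative choices, not from the lecture slides:

```python
import numpy as np

# Hypothetical sizes and hyperparameters, chosen just for illustration.
m, n = 1_000_000, 3                      # m training examples, n features
rng = np.random.default_rng(42)
X = rng.standard_normal((m, n))
true_theta = np.array([2.0, -1.0, 0.5])
y = X @ true_theta + 0.1 * rng.standard_normal(m)

theta = np.zeros(n)
eta = 0.1                                # learning rate

for step in range(100):
    # Every step touches ALL m examples: this full pass over the
    # training set is exactly what makes Batch GD slow when m is large.
    gradients = (2 / m) * X.T @ (X @ theta - y)
    theta -= eta * gradients

print(theta)                             # close to [2.0, -1.0, 0.5]
```

Notice that the gradient computation inside the loop involves the entire matrix `X`, so doubling the number of training examples roughly doubles the cost of every step.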

On the other hand, Batch Gradient Descent scales well with the number of features: the cost of each step grows only linearly with the feature count, so training a Linear Regression model with hundreds of thousands of features is still much faster with Gradient Descent than with a closed-form solution like the Normal Equation, which needs far more processing power at once.
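Here is a small sketch contrasting the two costs. The sizes are again hypothetical, picked only to illustrate the scaling argument:

```python
import numpy as np

# Hypothetical sizes, chosen just to contrast the two approaches.
m, n = 10_000, 2_000                     # many features
rng = np.random.default_rng(0)
X = rng.standard_normal((m, n))
y = rng.standard_normal(m)
theta = np.zeros(n)

# One Batch GD step costs O(m * n) work: linear in the number of features.
gradients = (2 / m) * X.T @ (X @ theta - y)
theta -= 0.1 * gradients

# The Normal Equation must invert an (n x n) matrix, roughly O(n^3) work,
# which becomes impractical with hundreds of thousands of features.
theta_exact = np.linalg.inv(X.T @ X) @ X.T @ y
```

This is why the two statements on the slide are not contradictory: the algorithm struggles as the number of training examples grows, but handles a large number of features comparatively well.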

I hope this helps.
Thanks!