Hi Nishant,
To start with, I had a similar question posted in the forum and did not get any reply even after two weeks, so I have resigned myself to the fact that this is probably how it is. I am yet to figure out the best way to get queries answered, except of course in class when they are answered. But then again, I don't feel it's right to take up a lot of other attendees' time if the query is not on the topic of that day's session.
Anyway, coming to the problem you are trying to solve: this is something I was also trying to solve and was able to make some headway on, so I think it may be of help. I got 96% accuracy on the validation set and 98% on training (in fact it finally reached 100%, so it overfitted), and I see 86% when I run the test set.
The steps are pretty much the same, but a few things stand out on this dataset:
- The number of images is too small for the model to learn properly, and when we split into training and validation it becomes even smaller.
- And since we split statically (out of 209 images we take roughly 29 for validation and 180 for training), the same train and validation sets are used in every epoch, so I feel the model is not really learning. (I wanted to ask someone about this, but as I have not received an answer to my initial question, I am not sure there is much point in asking.)
So the changes I made are basically two things:
- Increased the training dataset size by doing image augmentation, to take care of the first point about dataset size mentioned above. You can do this in multiple ways (flip, brighten, etc.), but what I did was flip each image, creating another 209 images from the flipped set. So the overall dataset size for the model to learn from doubled to 418.
Code below -
import numpy as np

# Horizontally flip every image (axis=2 is the width axis of a
# (num_images, height, width, channels) array)
train_set_x_orig_flip = np.flip(train_set_x_orig, axis=2)

# Stack the originals and the flipped copies: 209 + 209 = 418 images.
# Note: concatenate the ORIGINAL set with the flipped set,
# not the flipped set with itself.
train_set_x_orig = np.concatenate((train_set_x_orig, train_set_x_orig_flip), axis=0)

# The labels are unchanged by flipping, so just duplicate them
train_set_y_orig = np.concatenate((train_set_y_orig, train_set_y_orig))
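If you want to try one of the other augmentations I mentioned (brightening), here is a minimal NumPy-only sketch. The factor of 1.2 is just an illustrative choice, and `brighten` is a helper name I made up, not anything from the assignment:

```python
import numpy as np

def brighten(images, factor=1.2):
    """Scale pixel intensities by `factor` and clip back to the valid 0-255 range."""
    brightened = images.astype(np.float64) * factor
    return np.clip(brightened, 0, 255).astype(images.dtype)

# Example: a tiny fake batch of 2 RGB images of size 4x4
batch = np.random.randint(0, 256, size=(2, 4, 4, 3), dtype=np.uint8)
brighter = brighten(batch)
print(brighter.shape)  # same shape as the input batch
```

You could concatenate the brightened copies the same way as the flipped ones to grow the training set further.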
- While training the model, to shuffle the train and validation rows a bit, I wrapped model.fit in k-fold cross-validation (to take care of the second issue mentioned above).
Code as below -
from sklearn.model_selection import StratifiedKFold

X = train_set_x_orig_normalized
Y = train_set_y_orig

# StratifiedKFold keeps the class balance roughly the same in each fold
kFold = StratifiedKFold(n_splits=10)
for train, valid in kFold.split(X, Y):
    history = model.fit(X[train], Y[train], epochs=5,
                        validation_data=(X[valid], Y[valid]))
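One small tweak worth trying (my suggestion, not something I verified on this exact dataset): by default StratifiedKFold takes folds from contiguous blocks of the arrays, so passing shuffle=True mixes the rows before splitting. A self-contained sketch with dummy stand-in arrays, just to show the fold shapes:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Dummy stand-ins for the real dataset, matching its 209-image size
X = np.random.rand(209, 64, 64, 3)
Y = np.random.randint(0, 2, size=209)

# shuffle=True randomizes row order before splitting;
# random_state makes the folds reproducible
kFold = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
fold_sizes = [len(valid) for _, valid in kFold.split(X, Y)]
print(fold_sizes)  # each validation fold holds roughly 1/10 of the 209 rows
```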
BTW, why do you have the last Dense layer with two nodes? The output is just one, as it is a binary classification problem, right? Your y dataset is also just one attribute with values 1 or 0. If you make this change, you will need to use binary_crossentropy as the loss as well.
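To make that last point concrete, here is a minimal sketch of the kind of head I mean. Everything before the final layer is a placeholder (I don't know your actual architecture), and the 64x64x3 input shape is an assumption:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder body; only the final Dense layer is the point here
model = keras.Sequential([
    layers.Input(shape=(64, 64, 3)),        # assuming 64x64 RGB inputs
    layers.Flatten(),
    layers.Dense(16, activation="relu"),    # illustrative hidden layer
    layers.Dense(1, activation="sigmoid"),  # ONE node for binary classification
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",   # pairs with the single sigmoid output
              metrics=["accuracy"])
print(model.output_shape)  # (None, 1)
```

With a single sigmoid node the model outputs a probability of the positive class directly, which matches your 0/1 labels without any one-hot encoding.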
Hope this helps. I feel you should get a better result with these changes. The plot will still not show the convergence you expect, I believe, because we are splitting into mini-batches. But the training and validation accuracy, the test-set evaluation, and finally the predictions should be much better.