CNN model is not getting trained with all training samples on Kaggle

I've used the LeNet-5 CNN architecture to train on a dataset of ~40k traffic sign images available on Kaggle. The input data has the following shapes (features, labels):
Data - (39209, 32, 32, 3) (39209, )

I've split the input data in a 4:1 ratio to create training and validation sets:
X_train - (31367, 32, 32, 3)
X_valid - (7842, 32, 32, 3)
y_train - (31367, )
y_valid - (7842, )
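For context, here is a minimal sketch that reproduces the 4:1 split and the shapes above. The post doesn't show the actual split code, so the index-permutation approach and the dummy arrays are assumptions for illustration:

```python
import numpy as np

# Dummy stand-ins for the real Kaggle data (uint8 to keep memory small)
n = 39209
X = np.zeros((n, 32, 32, 3), dtype=np.uint8)
y = np.zeros((n,), dtype=np.int64)

# Shuffle indices, then take 80% for training and 20% for validation
rng = np.random.default_rng(42)
idx = rng.permutation(n)
split = int(n * 0.8)  # 31367
train_idx, valid_idx = idx[:split], idx[split:]

X_train, X_valid = X[train_idx], X[valid_idx]
y_train, y_valid = y[train_idx], y[valid_idx]

print(X_train.shape, X_valid.shape)  # (31367, 32, 32, 3) (7842, 32, 32, 3)
print(y_train.shape, y_valid.shape)  # (31367,) (7842,)
```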

Now, when I train the LeNet-5 CNN model on Kaggle, the progress bar shows only 981 training samples, while it should use all 31367.

Epoch 1/15
981/981 [==============================] - 5s 5ms/step - loss: 1.2600 - accuracy: 0.6402 - val_loss: 0.1751 - val_accuracy: 0.9484
Epoch 2/15
981/981 [==============================] - 5s 5ms/step - loss: 0.2318 - accuracy: 0.9288 - val_loss: 0.0850 - val_accuracy: 0.9753
Epoch 3/15
981/981 [==============================] - 5s 5ms/step - loss: 0.1386 - accuracy: 0.9570 - val_loss: 0.0494 - val_accuracy: 0.9864
Epoch 4/15
981/981 [==============================] - 4s 4ms/step - loss: 0.1045 - accuracy: 0.9692 - val_loss: 0.0403 - val_accuracy: 0.9909
Epoch 5/15
981/981 [==============================] - 4s 4ms/step - loss: 0.0894 - accuracy: 0.9734 - val_loss: 0.0281 - val_accuracy: 0.9939
Epoch 6/15
981/981 [==============================] - 4s 4ms/step - loss: 0.0788 - accuracy: 0.9769 - val_loss: 0.0256 - val_accuracy: 0.9929
Epoch 7/15
981/981 [==============================] - 4s 4ms/step - loss: 0.0736 - accuracy: 0.9788 - val_loss: 0.0253 - val_accuracy: 0.9940

Is there any constraint on Kaggle when training a model on a large number of samples, or is there something I'm missing? Below is the code for reference.

model = keras.models.Sequential([
    keras.layers.Conv2D(filters=6, kernel_size=(5, 5), strides=1, activation='tanh', input_shape=(32, 32, 3)),
    keras.layers.AveragePooling2D(pool_size=(2, 2), strides=2),
    keras.layers.Conv2D(filters=16, kernel_size=(5, 5), strides=1, activation='tanh'),
    keras.layers.AveragePooling2D(pool_size=(2, 2), strides=2),
    keras.layers.Conv2D(filters=120, kernel_size=(5, 5), activation='tanh'),
    keras.layers.Flatten(),
    keras.layers.Dense(units=84, activation='tanh'),
    keras.layers.Dense(units=43, activation='softmax'),
])

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

history = model.fit(X_train, y_train, batch_size=32, epochs=15, validation_data=(X_valid, y_valid))

Actually, that 981 is not a sample count — the Keras progress bar counts batches (steps), not individual samples. Since you have a batch size of 32 and 31367 training records, the number of batches per epoch is 31367/32 = 980.21875, which rounds up to 981. All 31367 samples are still used in every epoch.
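You can verify the step count yourself; Keras rounds the last partial batch up, which is a ceiling division:

```python
import math

n_train = 31367    # training samples from the post
batch_size = 32

# Steps per epoch = number of batches, with the last partial batch counted
steps_per_epoch = math.ceil(n_train / batch_size)
print(steps_per_epoch)  # 981
```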

Hi Sandeep,

Thanks for your response.

I've observed a different case with the Fashion MNIST dataset in "mnist_with_anns.ipynb" at cloudxlab. During training, despite batch_size = 20, the progress bar shows 55k, while by the same logic it should show 55000/20 = 2750 steps. Can you please help me understand this case?
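By the same batches-per-epoch reasoning, this is the count I would expect to see (55000 samples and batch_size = 20 are the numbers from that notebook as I read them):

```python
import math

n_train = 55000    # Fashion MNIST training samples in the notebook
batch_size = 20

# Expected steps per epoch if the progress bar counted batches
steps = math.ceil(n_train / batch_size)
print(steps)  # 2750
```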

Thanks!