End-to-End Machine Learning Project Part-2A

I did not understand the whole program.

End-to-End Machine Learning Project Part-2

To make this notebook’s output identical at every run

import numpy as np

np.random.seed(42)

np.random.permutation(5)

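def split_train_test(data, test_ratio):
    # Shuffle the row positions 0 .. len(data)-1
    shuffled_indices = np.random.permutation(len(data))
    # Number of rows that go into the test set
    test_set_size = int(len(data) * test_ratio)
    # The first test_set_size shuffled positions are the test set, the rest are the training set
    test_indices = shuffled_indices[:test_set_size]
    train_indices = shuffled_indices[test_set_size:]
    # Return both subsets together as a tuple: (train, test)
    return data.iloc[train_indices], data.iloc[test_indices]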

train_set,test_set = split_train_test(housing,0.2)

print(len(train_set),"train+",len(test_set),"test")

Please tell me the steps.

Please reply.

Basically, here we have written our own function to split the data into a training set and a test set.

Please try running each of the steps separately to see what it does.

np.random.permutation(num) generates all the integers from 0 up to num-1 in random order.
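
As a small sketch of this step, using the same seed 42 and the value 5 from the question (the variable name `perm` is only illustrative):

```python
import numpy as np

np.random.seed(42)  # fixing the seed makes the shuffle repeatable across runs

# permutation(5) returns each of the integers 0..4 exactly once, in shuffled order
perm = np.random.permutation(5)
print(perm)

# Sorting the result recovers 0, 1, 2, 3, 4, so nothing is lost or repeated
print(sorted(perm.tolist()))
```

Calling `np.random.permutation(len(data))` works the same way, just with one index per row of `data`.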

The first test_set_size shuffled indices go into test_indices and the rest go into train_indices.
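
A minimal sketch of that slicing step, assuming a dataset of 10 rows and a test ratio of 0.2 (both numbers are made up for illustration):

```python
import numpy as np

np.random.seed(42)
n_rows = 10  # hypothetical stand-in for len(data)
shuffled_indices = np.random.permutation(n_rows)

test_ratio = 0.2
test_set_size = int(n_rows * test_ratio)  # 2 rows go to the test set

# [:k] takes the first k shuffled indices, [k:] takes everything after them
test_indices = shuffled_indices[:test_set_size]
train_indices = shuffled_indices[test_set_size:]

print(len(train_indices), "train +", len(test_indices), "test")
```

The colon in `shuffled_indices[test_set_size:]` matters: without it, the expression picks a single index instead of all the remaining ones.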

data.iloc[...] picks the rows of data at the positions in train_indices and in test_indices, and the function returns the two resulting DataFrames together as a tuple.
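
To see the tuple, here is a rough sketch on a toy DataFrame (the name `toy` and its `value` column are made up; `housing` from the notebook would behave the same way). In Python, writing two values separated by a comma after `return` is enough to return them as a tuple:

```python
import numpy as np
import pandas as pd

def split_train_test(data, test_ratio):
    shuffled_indices = np.random.permutation(len(data))
    test_set_size = int(len(data) * test_ratio)
    test_indices = shuffled_indices[:test_set_size]
    train_indices = shuffled_indices[test_set_size:]
    # The comma builds a tuple: (training rows, test rows)
    return data.iloc[train_indices], data.iloc[test_indices]

toy = pd.DataFrame({"value": range(10)})  # hypothetical stand-in for housing

result = split_train_test(toy, 0.2)
print(type(result))  # the function returned one tuple holding two DataFrames

# Unpacking assigns the first element to train_set and the second to test_set
train_set, test_set = result
print(len(train_set), "train +", len(test_set), "test")
```

So `train_set, test_set = split_train_test(housing, 0.2)` is just this unpacking done in one line.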

I did not understand.

“returns a tuple containing the two.”

data.iloc[train_indices],data.iloc[test_indices]

How can I return a tuple?

Please tell me the steps.

Please reply.