import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin

class MostFrequentImputer(BaseEstimator, TransformerMixin):
    def fit(self, X, y=None):
        self.most_frequent_ = pd.Series([X[c].value_counts().index[0] for c in X],
                                        index=X.columns)
        return self
    def transform(self, X, y=None):
        return X.fillna(self.most_frequent_)
Create an imputer for string categorical columns
This is the 10th exercise in the Titanic project. Can somebody explain what this code is doing and how it works?
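To make the question concrete, here is a minimal sketch that runs the class on a toy DataFrame. The frame `toy` and its values are made up for illustration and are not taken from the Titanic data; the comments describe what each step appears to do, based on how `value_counts` and `fillna` behave in pandas.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin

class MostFrequentImputer(BaseEstimator, TransformerMixin):
    def fit(self, X, y=None):
        # value_counts() ignores NaN and sorts by count (descending),
        # so index[0] is the most common value in each column.
        self.most_frequent_ = pd.Series(
            [X[c].value_counts().index[0] for c in X],
            index=X.columns)
        return self

    def transform(self, X, y=None):
        # fillna with a Series matches the Series index to column names,
        # so each column's NaN gets that column's stored value.
        return X.fillna(self.most_frequent_)

# Hypothetical toy data: two string columns with one missing value each.
toy = pd.DataFrame({
    "Embarked": ["S", "C", "S", np.nan],
    "Sex": ["male", "female", np.nan, "female"],
})

imputer = MostFrequentImputer()
filled = imputer.fit_transform(toy)  # fit_transform is inherited from TransformerMixin

print(imputer.most_frequent_)
print(filled)
```

In this toy case the missing `Embarked` entry is filled with "S" and the missing `Sex` entry with "female", since those are the most common values in their columns.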