Hi CloudX Lab Team,
I need another clarification regarding overfitting of the AdaBoost Ensemble ML Algorithm.
If there is overfitting, we need to adopt a Regularization method.
For this Regularization, we can adopt either of the following two strategies:
a) Reduce the number of Estimators (i.e. the number of weak learners, typically Decision Trees, that are trained sequentially in the ensemble).
b) Regularize the Base Estimator.
- This Base Estimator contains the Base Decision Node.
- The Base Decision Node lies close to the Root of the Tree and comprises the Important Features (as stated in Slide 71).
- While growing the tree(s), either of the following two strategies can be adopted to split the nodes:
a) a random subset of Features/Predictors/Variables, or
b) random threshold values for each feature.
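To make my understanding of strategy (a) concrete, here is a minimal sketch using scikit-learn's AdaBoostClassifier on a noisy synthetic dataset — the dataset, parameter values, and train/test split are my own illustrative choices, not from the course slides:

```python
# Sketch: contrasting a smaller AdaBoost ensemble with a larger,
# more overfit-prone one on a noisy synthetic dataset.
from sklearn.datasets import make_moons
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=500, noise=0.3, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# (a) Fewer estimators: each "estimator" here is one weak learner
#     (a depth-1 decision stump by default), trained sequentially.
small = AdaBoostClassifier(n_estimators=30, random_state=42)

# A much larger ensemble can fit the noise in the training set.
large = AdaBoostClassifier(n_estimators=300, random_state=42)

for name, model in [("30 estimators", small), ("300 estimators", large)]:
    model.fit(X_train, y_train)
    print(name,
          "train:", round(model.score(X_train, y_train), 3),
          "test:", round(model.score(X_test, y_test), 3))

# (b) Regularizing the base estimator would mean constraining the weak
#     learner itself, e.g. limiting its max_depth; the keyword for passing
#     a custom base learner differs across scikit-learn versions, so it
#     is omitted here (the default base learner is already a depth-1 stump).
```

If my understanding is right, comparing the train and test scores of the two models should show the gap widening as the number of estimators grows.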
So, technically, is this the appropriate way of describing overfitting of the AdaBoost Ensemble ML Algorithm and how to resolve it?
Kindly correct me (each and every step) if my thought process is heading in the wrong direction.
Sincerely looking forward to your valuable inputs.