Which measure of central tendency do Decision Trees use by default for the predicted target value?
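The mean. By default, a regression tree predicts the mean of the target values of the training samples that fall into a leaf.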
Why is the mean used rather than the median, given that the median is a more robust measure of center in the presence of outliers?
There are two main reasons:
- Calculating the median requires the data to be sorted, which makes the process more computationally intensive.
- The median is a better measure of center mainly when the data is skewed or contains outliers.
However, in a Decision Tree the data points in a given leaf are unlikely to include an outlier, because the outlier would most likely land in a different leaf.
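
The second point can be illustrated with a minimal sketch. It is not how any particular library implements tree building. The helper `best_split` and the toy data are hypothetical, and the split rule (minimizing the summed squared error of the two leaves) is assumed here as the usual default criterion for regression trees.

```python
from statistics import mean, median


def sum_squared_error(values):
    """Squared error of a leaf that predicts the mean of its values."""
    center = mean(values)
    return sum((v - center) ** 2 for v in values)


def best_split(xs, ys):
    """Hypothetical helper: find the single threshold on xs that
    minimizes the summed squared error of the two resulting leaves."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        error = sum_squared_error(left) + sum_squared_error(right)
        if best is None or error < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (error, threshold, left, right)
    return best[1], best[2], best[3]


# Toy data (made up for illustration): the last target is an outlier.
xs = [1, 2, 3, 4, 5, 6]
ys = [10, 11, 9, 10, 12, 100]

threshold, left, right = best_split(xs, ys)

print("all data: mean", mean(ys), "median", median(ys))
print("split at", threshold)
print("left leaf:", left, "mean", mean(left), "median", median(left))
print("right leaf:", right)
```

On this toy data the outlier ends up alone in the right leaf. The mean and median of the left leaf are then close to each other (10.4 and 10), whereas on the unsplit data they differ widely. This is the situation the answer describes: once the outlier is isolated, the cheaper mean is about as good a center as the median.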