PCA question doubt urgent

Queen_Saikia · October 27, 2020, 1:18pm

Q.No1- Prove that samples in the low-dimensional principal subspace are
uncorrelated.

Q.No2- Consider N d-dimensional data points. Derive the expression for the
variance/covariance matrix S_{d×d}(X) in terms of the variance/covariance matrix
S_{N×N}(X^T), where d >> N.

What will be the solution to these questions?

Ujjwal_Sharma · October 30, 2020, 11:32am

Try performing operations on the correlation matrix.

sgiri · October 30, 2020, 12:38pm

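PCA works by finding new axes (the principal components) that are mutually orthogonal and then dropping the ones that carry the least variance, i.e. the redundant dimensions. These new axes are the eigenvectors of the covariance matrix, so the data expressed along them is uncorrelated.

Therefore, after PCA and removal of some of the dimensions, the coordinates of the samples in the low-dimensional principal subspace are uncorrelated with each other: their covariance matrix is diagonal, with the retained eigenvalues on the diagonal.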
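
To make the argument for Q1 concrete: if X is the centred data, S = X^T X / N its covariance matrix, and W holds the top k orthonormal eigenvectors of S, then the projected samples Z = X W have covariance W^T S W, which is diagonal because S W = W Λ and W^T W = I. The following is only a minimal numerical sketch of that identity on a small toy data set; the names X, W, Z and the choice k = 2 are illustrative assumptions, not part of the original question.

```python
import numpy as np

# Toy data: N = 5 samples (rows), d = 3 features (columns)
A = np.array([
    [90, 60, 90],
    [90, 90, 30],
    [60, 60, 60],
    [60, 60, 90],
    [30, 30, 30],
], dtype=float)

X = A - A.mean(axis=0)        # centre each column
S = X.T @ X / len(X)          # d x d covariance matrix (divide by N)

# eigh is meant for symmetric matrices; it returns eigenvalues in
# ascending order and orthonormal eigenvectors as columns
vals, vecs = np.linalg.eigh(S)
order = np.argsort(vals)[::-1]            # sort by descending variance
vals, vecs = vals[order], vecs[:, order]

k = 2                         # number of components kept (illustrative)
W = vecs[:, :k]               # d x k basis of the principal subspace
Z = X @ W                     # N x k projected samples

S_Z = Z.T @ Z / len(Z)        # covariance of the projected samples
off_diag = S_Z - np.diag(np.diag(S_Z))

print(np.round(S_Z, 6))
print(np.allclose(off_diag, 0.0, atol=1e-6))      # off-diagonal terms vanish
print(np.allclose(np.diag(S_Z), vals[:k]))        # diagonal = kept eigenvalues
```

The off-diagonal entries are the covariances between different principal coordinates, so their vanishing is exactly the "uncorrelated" statement in the question.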

sgiri · October 30, 2020, 12:45pm

I am not entirely clear about the question, but covariance is computed in much the same way as variance.

For N points with d dimensions each, follow this process to find the covariance:

1. Find the mean value of each of the d dimensions (columns).
2. Subtract the column means from each row.
3. For every pair of columns, multiply the values element-wise, sum over the N data points, and divide by N.

The resulting d x d matrix is the covariance matrix.

Here is the code:

import numpy as np

A = np.array([
    [90, 60, 90],
    [90, 90, 30],   
    [60, 60, 60],
    [60, 60, 90],
    [30, 30, 30],
])

# 1. Find the mean value of each of the d dimensions.
AM = np.mean(A, axis=0)
AM

array([66., 60., 60.])


# 2. Subtract the column means from each row.
AD = A - AM
AD

array([[ 24.,   0.,  30.],
       [ 24.,  30., -30.],
       [ -6.,   0.,   0.],
       [ -6.,   0.,  30.],
       [-36., -30., -30.]])

ADT = AD.T

ADT

array([[ 24.,  24.,  -6.,  -6., -36.],
       [  0.,  30.,   0.,   0., -30.],
       [ 30., -30.,   0.,  30., -30.]])


# 3. For every pair of columns, multiply the values, sum over the N data points, and divide by N.
# The resulting matrix is the covariance matrix.

COV = np.dot(ADT, AD)/len(A)

COV

array([[504., 360., 180.],
       [360., 360.,   0.],
       [180.,   0., 720.]])
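
For Q2, one possible reading (an assumption, since the question is not entirely clear) is the usual trick for d >> N: with centred data X of shape N x d, let S = X^T X / N be the d x d covariance matrix and G = X X^T / N the N x N matrix built from X^T. If G v = λ v with λ > 0, then S (X^T v) = X^T (X X^T / N) v = λ (X^T v), so X^T v is an eigenvector of S with the same eigenvalue. The sketch below checks this numerically and also cross-checks the covariance above against np.cov; the names G, V and U are illustrative. Note that the toy data has N > d, so nothing is saved here; the benefit only appears when d is much larger than N.

```python
import numpy as np

A = np.array([
    [90, 60, 90],
    [90, 90, 30],
    [60, 60, 60],
    [60, 60, 90],
    [30, 30, 30],
], dtype=float)

X = A - A.mean(axis=0)        # centred data, N x d
N = len(X)

S = X.T @ X / N               # d x d covariance matrix
G = X @ X.T / N               # N x N matrix built from X^T

# Cross-check: bias=True makes np.cov divide by N instead of N - 1,
# rowvar=False says that columns are the variables
print(np.allclose(S, np.cov(A, rowvar=False, bias=True)))

# Eigen-decompose the N x N matrix and keep the non-zero eigenvalues
mu, V = np.linalg.eigh(G)
keep = mu > 1e-9 * mu.max()
mu, V = mu[keep], V[:, keep]

# Map each eigenvector v of G to X^T v and normalise to unit length
U = X.T @ V
U /= np.linalg.norm(U, axis=0)

# Each column of U is an eigenvector of S with the same eigenvalue
print(np.allclose(S @ U, U * mu))

# The non-zero eigenvalues of S and G coincide
s_vals = np.linalg.eigvalsh(S)
s_vals = s_vals[s_vals > 1e-9 * s_vals.max()]
print(np.allclose(np.sort(mu), np.sort(s_vals)))
```

The design point is that only the smaller of the two matrices needs to be eigen-decomposed: N x N instead of d x d when d >> N.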