I would like to better understand dimensionality reduction in machine learning from a real-world perspective.
Dimensionality reduction can be used for data analytics as well as for data visualization. Let me give two examples.
1st illustration / real-world example:
Take an organization with a headcount of 250 employees. Details about these employees are stored in 5 different locations or tables, one per team, as follows:
a) Admin Department - Employee Data containing Demographic Details
b) Finance Department - Employee Data containing Salary Details, Tax Liability details
c) Operations Team - Employee Data containing their Daily Productivity Targets Achieved, Pending Targets
d) Quality Team - Quality Adherence Percentage, % of Errors, Classification of Errors using 5 * 5 Risk Matrix etc.
e) HR Team - Roll-out of Monthly Incentives, Bonus, Insurance, PF, Gratuity, Recruitments, Employee Resignations, Leaves availed on Monthly/Quarterly/Yearly Basis and other similar data.
i) With such diverse information available about each employee from various perspectives, can we call this multidimensional data?
ii) Since the data for these 250 employees is spread across several tables in different locations, would this be an apt case for using SQL to combine it, with horizontal joins (JOIN on an employee ID) and vertical joins (UNION)? Is this concept correct?
iii) Also, there are different teams such as HR, Admin, Logistics, IT, and Finance, and not all departments are evaluated on factors such as quality, production, and incentives. This leads to sparsity (missing values) in many features for many employees, which is an important consideration when applying dimensionality reduction techniques. Is this understanding correct?
iv) Can this example be used to explain dimensionality reduction techniques such as PCA and Factor Analysis?
I may be wrong in my understanding, but I want to get the facts right so that dimensionality reduction can be applied in real-world settings.
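To make questions ii) to iv) concrete, here is a minimal sketch of what I have in mind. Everything in it is an assumption for illustration: the three toy tables, the `emp_id` key, the column names, and the values are made up, mean imputation is only one crude way to handle the missing values, and PCA is computed directly with NumPy's SVD rather than with any particular ML library.

```python
import numpy as np
import pandas as pd

# Hypothetical toy tables; column names and values are invented for illustration.
admin = pd.DataFrame({"emp_id": [1, 2, 3, 4], "age": [29, 41, 35, 50]})
finance = pd.DataFrame({"emp_id": [1, 2, 3, 4], "salary": [40.0, 65.0, 52.0, 80.0]})
# Assume only employees 1 and 2 are evaluated on quality.
quality = pd.DataFrame({"emp_id": [1, 2], "quality_pct": [97.0, 91.0]})

# "Horizontal join": one row per employee, with columns from every table.
# A left join keeps employees without a quality record; their value becomes NaN.
wide = (
    admin.merge(finance, on="emp_id", how="left")
         .merge(quality, on="emp_id", how="left")
)

features = wide.drop(columns="emp_id")
missing_share = features.isna().mean()  # fraction of missing values per column

# PCA cannot work with NaN, so fill gaps with the column mean (a crude assumption),
# then standardize so that no column dominates purely because of its units.
filled = features.fillna(features.mean())
z = ((filled - filled.mean()) / filled.std(ddof=0)).to_numpy()

# PCA via SVD of the standardized matrix; project onto the first 2 components.
u, s, vt = np.linalg.svd(z, full_matrices=False)
reduced = z @ vt[:2].T

print(wide)
print(missing_share)
print(reduced.shape)
```

In this sketch the join produces the sparsity I described in iii): the `quality_pct` column is empty for employees who are not evaluated on quality, and that gap has to be handled somehow before PCA can be applied.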
2nd illustration / real-world example:
A country’s GDP (Gross Domestic Product) depends on various factors:
a) Internal factors within that country and
b) External factors i.e. Global Affairs
Also, each data point in these datasets gives us valuable insights.
Since such data is collated from various angles and perspectives, can we apply dimensionality reduction techniques to such datasets?
If the answer is yes, a follow-up question arises:
Won’t the information carried by the data points across the many features of such large datasets be compromised when a dimensionality reduction technique is applied?
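As far as I understand, this loss can at least be measured. Below is a minimal sketch, assuming PCA is the technique used. The data is synthetic (6 columns generated from 2 hidden factors plus noise), not real GDP data, and the variable names are my own. It compares the share of variance kept by the first `k` components with the relative reconstruction error after mapping the reduced data back to the original columns.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a table of indicators: 200 rows, 6 columns,
# driven by 2 hidden factors plus a little noise (purely illustrative).
factors = rng.normal(size=(200, 2))
loadings = rng.normal(size=(2, 6))
x = factors @ loadings + 0.1 * rng.normal(size=(200, 6))

# Center the columns, then run PCA via SVD.
xc = x - x.mean(axis=0)
u, s, vt = np.linalg.svd(xc, full_matrices=False)

# Share of total variance captured by each principal component.
explained = s**2 / np.sum(s**2)

k = 2  # number of components kept
retained = explained[:k].sum()

# Project onto k components, map back, and measure what was lost.
x_hat = (xc @ vt[:k].T) @ vt[:k]
lost = np.sum((xc - x_hat) ** 2) / np.sum(xc**2)

print(f"variance retained: {retained:.3f}, relative reconstruction error: {lost:.3f}")
```

For PCA the two numbers add up to 1, so whatever is not retained is exactly what is lost. Whether that loss is acceptable would depend on how many components are kept and on how much of the discarded variance is noise rather than signal.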
Please share your views on the questions raised above.