PCA vs Correlation
What is the relationship between the (first) principal component(s) and the correlation matrix, or the average correlation, of the data? For example, in an empirical application I observe that the average correlation is almost the same as the ratio of the variance of the first principal component (the first eigenvalue) to the total variance (the sum of all eigenvalues). Is there a mathematical relationship?

Shannon Andrews 2022-07-16 Answered

Answers (1)

minotaurafe
Answered 2022-07-17 Author has 22 answers
Short Answer: For PCA on standardized data, the principal components are the eigenvectors of the correlation matrix. Therefore, multiplying the correlation matrix (C) by a principal component (V) gives the same vector V scaled by the corresponding eigenvalue λ:
C V = λ V
Details: Given n-dimensional data $x_i \in \mathbb{R}^n$, suppose we have m data points represented as the rows of a matrix X (an m × n matrix). With Cor(i, j) denoting the correlation between dimensions i and j, the correlation matrix is defined as:
$$C = \begin{bmatrix} \mathrm{Cor}(0,0) & \mathrm{Cor}(0,1) & \cdots & \mathrm{Cor}(0,n-1) \\ \mathrm{Cor}(1,0) & \mathrm{Cor}(1,1) & \cdots & \mathrm{Cor}(1,n-1) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{Cor}(n-1,0) & \mathrm{Cor}(n-1,1) & \cdots & \mathrm{Cor}(n-1,n-1) \end{bmatrix}$$
Since the correlation matrix is a symmetric n × n matrix, it has n mutually orthogonal eigenvectors, and these vectors are the principal components of the data. Each principal component V is of size n × 1, and its corresponding eigenvalue λ is a scalar.
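A rough numerical sketch may help connect this to the observation in the question. Because the diagonal of C is all ones, the eigenvalues sum to n, so the first component's share of the variance is λ₁/n. Applying the Rayleigh quotient to the unit vector with all entries equal to 1/√n gives λ₁ ≥ 1 + (n − 1)ρ̄, where ρ̄ is the average off-diagonal correlation. Hence λ₁/n ≥ ρ̄ + (1 − ρ̄)/n, with near equality when the correlations are all positive and of similar size. The simulated data and the helper name `first_pc_share_and_avg_corr` below are assumptions made purely for illustration:

```python
import numpy as np

def first_pc_share_and_avg_corr(X):
    """Return (lambda_1 / n, average off-diagonal correlation, Rayleigh lower bound)."""
    C = np.corrcoef(X, rowvar=False)        # n x n correlation matrix of the columns
    n = C.shape[0]
    eigvals, eigvecs = np.linalg.eigh(C)    # eigh: symmetric input, ascending eigenvalues
    lam1, v1 = eigvals[-1], eigvecs[:, -1]  # largest eigenvalue and its eigenvector
    assert np.allclose(C @ v1, lam1 * v1)   # checks C V = lambda V
    share = lam1 / eigvals.sum()            # eigenvalues sum to trace(C) = n
    avg_corr = (C.sum() - n) / (n * (n - 1))
    bound = avg_corr + (1 - avg_corr) / n   # lower bound on share from the Rayleigh quotient
    return share, avg_corr, bound

# Hypothetical data: one common factor plus independent noise,
# so every pairwise correlation is roughly 0.5.
rng = np.random.default_rng(0)
m, n = 2000, 20
common = rng.standard_normal((m, 1))
X = common + rng.standard_normal((m, n))

share, avg_corr, bound = first_pc_share_and_avg_corr(X)
print(f"lambda_1 / n        = {share:.3f}")
print(f"average correlation = {avg_corr:.3f}")
print(f"lower bound         = {bound:.3f}")
```

When the correlations differ widely in size or sign, the all-ones direction is no longer close to the first eigenvector, and the variance share can exceed the average correlation by a wide margin.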

You might be interested in

asked 2022-09-17
Conditional Probability and Independence nonsense in a problem
The statement:
Suppose that a patient tests positive for a disease affecting 1% of the population. For a patient who has the disease, there is a 95% chance of testing positive, and for a patient who doesn't have the disease, there is a 95% chance of testing negative. The patient gets a second, independent, test done, and again tests positive. Find the probability that the patient has the disease.
The problem:
I can solve this problem, but I'm unable to understand what is wrong with the following:
Let $T_i$ be the event that the patient tests positive in the i-th test, and let $D$ be the event that the patient has the disease.
The problem says that the tests are independent.
By law of total probability we know that:
$P(T_1, T_2) = P(T_1, T_2 \mid D)\,P(D) + P(T_1, T_2 \mid D^c)\,P(D^c)$
Replacing, and assuming conditional independence given D, we have:
$P(T_1, T_2) = 0.95^2 \cdot 0.01 + 0.05^2 \cdot 0.99 = 0.0115$
This is the correct result, but now let's consider that:
$P(T_1, T_2) = P(T_1)\,P(T_2)$
We know that $P(T_i) = P(T_1)$ for all i because of symmetry, so we have $P(T_1, T_2) = P(T_1)^2$. Again, by law of total probability:
$P(T_1) = P(T_1 \mid D)\,P(D) + P(T_1 \mid D^c)\,P(D^c)$
$P(T_1) = 0.95 \cdot 0.01 + 0.05 \cdot 0.99 \approx 0.059$
So we have:
$P(T_1, T_2) = P(T_1)^2 \approx 0.059^2 \approx 0.003481$
The second approach is wrong, but it seems legitimate to me, and I'm unable to find what's wrong.
Thanks for your help, you make self studying easier.
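The two computations can be checked numerically with a short sketch. It assumes, as the first approach does, that the tests are independent given the disease status. Under that assumption they are not independent marginally, because a first positive result raises the probability of disease and therefore of a second positive. This is why squaring P(T₁) gives a different number. The variable names are illustrative only:

```python
# Figures from the problem statement.
p_d = 0.01        # P(D): prevalence
sens = 0.95       # P(T | D): positive test given disease
false_pos = 0.05  # P(T | not D): positive test given no disease

# Law of total probability for a single positive test.
p_t1 = sens * p_d + false_pos * (1 - p_d)

# Two positives, assuming the tests are independent GIVEN disease status.
p_t1_t2 = sens**2 * p_d + false_pos**2 * (1 - p_d)

# The second approach treats the tests as marginally independent.
p_t1_squared = p_t1**2

# Bayes' rule: probability of disease given two positive tests.
posterior = sens**2 * p_d / p_t1_t2

print(f"P(T1)       = {p_t1:.6f}")
print(f"P(T1, T2)   = {p_t1_t2:.6f}")
print(f"P(T1)^2     = {p_t1_squared:.6f}")
print(f"P(D|T1, T2) = {posterior:.4f}")
```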
asked 2022-08-18
Statistics: How can I prove that a change in one variable is influenced by another variable, that is to say, that they are related?
I major in bioinformatics. I have the following problem: temperature changes over the course of a year, and I find that the incidence of a disease is very high when the temperature is relatively high and very low when the temperature is relatively low; that is to say, they are related. I want to find a way to show that they are related, rather than just feeling intuitively that they are. Any suggestions?
asked 2022-08-31
Do future events influence the probability of past events?
This question came to mind when I was dealing with this question: A card from a pack of 52 cards is lost. From the remaining cards of the pack, two cards are drawn at random and are found to be clubs. Find the probability that the lost card is a club.
My query: The probability that a club is lost is 13/52. The next event of taking two cards from this incomplete set of cards happens after one of the cards is lost. How can this affect a past event of losing a card from the pack?
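As a sketch of how the later observation changes the probability assigned to the earlier event, Bayes' rule can be applied with exact fractions. The draw does not change which card was lost. It only changes what we know about it. The variable names are illustrative only:

```python
from fractions import Fraction
from math import comb

# Prior: the lost card is a club.
prior_club = Fraction(13, 52)

# Likelihood of drawing two clubs from the remaining 51 cards.
like_if_club_lost = Fraction(comb(12, 2), comb(51, 2))   # 12 clubs remain
like_if_other_lost = Fraction(comb(13, 2), comb(51, 2))  # 13 clubs remain

# Bayes' rule: P(lost card is a club | two clubs drawn).
numerator = prior_club * like_if_club_lost
posterior = numerator / (numerator + (1 - prior_club) * like_if_other_lost)

print(posterior)  # 11/50
```

The posterior, 11/50, is lower than the prior of 13/52 because drawing two clubs is slightly more likely when all 13 clubs are still in the pack.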
asked 2022-07-20
Correlation of Rolling Two Dice
If A is a random variable giving the sum of two independent rolls of a die, and B is the value of the first roll minus the value of the second roll, is it true that A and B have $\operatorname{cov}(A, B) \neq 0$? In other words, is it true that they are correlated?
I've come to the conclusion that they must be correlated because they are not independent, that is, the event of A can have an impact on event B, but I remain stuck due to the fact that causation does not necessarily imply correlation.
I know that independence implies zero correlation, but that the converse isn't true.
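A brute-force check over the 36 equally likely outcomes is one way to settle this. The sketch below assumes fair six-sided dice. It computes cov(A, B) exactly, then compares one joint probability with the product of its marginals to test independence:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (x, y) of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(outcomes))

def expect(f):
    """Exact expectation of f(x, y) over the 36 outcomes."""
    return sum(p * f(x, y) for x, y in outcomes)

e_a = expect(lambda x, y: x + y)               # E[A]
e_b = expect(lambda x, y: x - y)               # E[B]
e_ab = expect(lambda x, y: (x + y) * (x - y))  # E[AB] = E[X^2] - E[Y^2]
cov_ab = e_ab - e_a * e_b

# Independence check on one event pair: A = 12 and B = 0.
p_joint = sum(p for x, y in outcomes if x + y == 12 and x - y == 0)
p_a12 = sum(p for x, y in outcomes if x + y == 12)
p_b0 = sum(p for x, y in outcomes if x - y == 0)

print("cov(A, B) =", cov_ab)
print("P(A=12, B=0) =", p_joint, "vs P(A=12) P(B=0) =", p_a12 * p_b0)
```

The covariance is exactly zero even though the joint probability differs from the product of the marginals, so A and B are uncorrelated but dependent.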
asked 2022-08-12
Advantages of Mathematics competition/olympiad students in Mathematical Research
Everyone in this community I think would be familiar with International Mathematical Olympiad, which is an International Mathematics Competition held for high school students, with many countries participating from around the world.
What's interesting to note is that many of the IMO participants have gone on to win the Fields Medal. Notable personalities include Terence Tao (2006), Ngo Bao Chau (2010), Grigori Perelman (2006), etc.
I would like to know: What advantages does an IMO student possess over a 'normal' student in terms of mathematical research? Does the IMO competition help the student in becoming a good research mathematician or doesn't it?
asked 2022-08-13
If we have two non-zero correlated random variables then they are dependent.
Why then do we have the saying "Correlation does not imply Causation". A change in one variable may not cause exactly the same change in another but there is at least some 'causal' link.
asked 2022-07-20
1. In your own words, describe the difference between continuous and discrete data sets.
2. What is the difference between the independent and dependent variable? How do you determine which goes on which axis on your graph?
3. What are the four types of correlation?
4. How do they differ from one another?
5. In your own words, explain the difference between correlation and causation.

New questions

Euclid's view and Klein's view of Geometry and Associativity in Group
One common object in the study of Euclidean geometry (Euclid's view) is the "congruence" relation, specifically "congruence of triangles". We know that this congruence relation is an equivalence relation:
Every triangle is congruent to itself.
If triangle $T_1$ is congruent to triangle $T_2$, then $T_2$ is congruent to $T_1$.
If $T_1$ is congruent to $T_2$ and $T_2$ is congruent to $T_3$, then $T_1$ is congruent to $T_3$.
This congruence relation (from Euclid's view) can be translated into a relation coming from "groups". Let $\mathrm{Iso}(\mathbb{R}^2)$ denote the set of all isometries of the Euclidean plane (= distance-preserving maps from the plane to itself). Then the above relations may be understood from Klein's view as:
∃ an identity element in $\mathrm{Iso}(\mathbb{R}^2)$ which takes every triangle to itself.
If $g \in \mathrm{Iso}(\mathbb{R}^2)$ is an element taking triangle $T_1$ to $T_2$, then $g^{-1} \in \mathrm{Iso}(\mathbb{R}^2)$, which takes $T_2$ to $T_1$.
If $g \in \mathrm{Iso}(\mathbb{R}^2)$ takes $T_1$ to $T_2$ and $h \in \mathrm{Iso}(\mathbb{R}^2)$ takes $T_2$ to $T_3$, then $h \circ g \in \mathrm{Iso}(\mathbb{R}^2)$, which takes $T_1$ to $T_3$.
One can see that in Klein's view, three axioms in the definition of a group appear. But the definition of a "group" also includes "associativity", which is not needed in the above translation from Euclid's view to Klein's view of geometry.
Question: What is the reason for introducing associativity in the definition of a group? If we look at geometry from Klein's view, does the "associativity" of the group put any restriction on the geometry?