I am reading about PCA and found an exercise that says Show that when a N-dim set of data points X

rmd1228887e 2022-07-06 Answered
I am reading about PCA and found an exercise that says
Show that when an N-dim set of data points $X$ is projected onto the eigenvectors $V = [e_1\ e_2\ \dots\ e_n]$ of its covariance matrix $C = XX^T$, the covariance matrix of the projected data $C_p = YY^T$ is diagonal and hence that, in the space of the eigenvector decomposition, the distribution of $X$ is uncorrelated.
What I have so far is
$Y = V^T X$
Therefore
$C_p = YY^T = V^T X (V^T X)^T = V^T X X^T V$
but there I got stuck. Any advice on how to proceed? Moreover, what does "the covariance matrix of the projected data is diagonal" mean?

Answers (1)

Hayley Mccarthy
Answered 2022-07-07 Author has 19 answers
A diagonal matrix is one whose off-diagonal entries are all zero; the diagonal entries themselves may be zero or nonzero.
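Assuming the eigenvectors are normalized to unit magnitude, $\|e_i\| = 1$:
$V = [e_1\ e_2\ \dots\ e_n]$ and $C e_i = \lambda_i e_i$
$C V = V \Lambda$, where $\Lambda = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}$
$C_p = V^T (X X^T) V = V^T C V$
$C_p = V^T V \Lambda$
$V^T V = I$, as eigenvectors of a symmetric matrix are orthogonal (or can be chosen orthogonal when eigenvalues repeat)
$C_p = \Lambda = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}$
This is a diagonal matrix whose diagonal entries are the variances along each eigenvector direction. The zero off-diagonal entries are the covariances between different projected coordinates, which is what "uncorrelated" means here.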
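As a numerical sanity check of the derivation above, here is a minimal sketch in Python with NumPy. The random data, the dimensions, and the variable names are assumptions made for illustration only. It uses the unnormalized covariance $C = XX^T$ from the exercise, with one data point per column of $X$.

```python
import numpy as np

# Illustrative data (an assumption): 4-dimensional points, one per column.
rng = np.random.default_rng(0)
n_dims, n_points = 4, 500
mixing = rng.normal(size=(n_dims, n_dims))        # makes the coordinates correlated
X = mixing @ rng.normal(size=(n_dims, n_points))
X = X - X.mean(axis=1, keepdims=True)             # center each coordinate

C = X @ X.T                                       # covariance as defined in the exercise
eigvals, V = np.linalg.eigh(C)                    # columns of V are orthonormal eigenvectors

Y = V.T @ X                                       # project the data onto the eigenvectors
C_p = Y @ Y.T                                     # covariance of the projected data

# The off-diagonal part of C_p should vanish up to rounding error.
off_diag = C_p - np.diag(np.diag(C_p))
max_off_diag = np.abs(off_diag).max()
print("largest off-diagonal entry:", max_off_diag)
print("diagonal matches eigenvalues:", np.allclose(np.diag(C_p), eigvals))
```

`np.linalg.eigh` is used rather than `np.linalg.eig` because $C$ is symmetric, so the returned eigenvectors are orthonormal and $V^T V = I$ holds numerically.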

You might be interested in

asked 2021-02-23
Interpreting z-scores: Complete the following statements using your knowledge about z-scores.
a. If the data is weight, the z-score for someone who is overweight would be
-positive
-negative
-zero
b. If the data is IQ test scores, an individual with a negative z-score would have a
-high IQ
-low IQ
-average IQ
c. If the data is time spent watching TV, an individual with a z-score of zero would
-watch very little TV
-watch a lot of TV
-watch the average amount of TV
d. If the data is annual salary in the U.S and the population is all legally employed people in the U.S., the z-scores of people who make minimum wage would be
-positive
-negative
-zero
asked 2022-06-14
Numerical (Second) Derivative of Time Series Data
First and second order derivatives are often used in chromatography to detect hidden peaks. The time series data consists of Instrumental Response vs. Time at very short time intervals (250 Hz). I wanted to calculate the second derivative of the data numerically in Excel. The simple option is to calculate the first derivative and then calculate the first derivative of the first derivative to get the second derivative. The other option is to use the direct approach, using the central difference formula for the second derivative. The question is about the denominator of the second derivative from the central difference formula. It should be the square of the time interval. This is my understanding, and it is dimensionally consistent: for example, distance x (m) becomes acceleration (m/s²) as the second derivative of x.
A reviewer wrote a rather denigrating comment saying that there is a lack of understanding of the second derivative "definition" where the authors assert that the definition of a second derivative requires division by the square of the time interval. This reference to the square of a time interval suggests a worrying lack of understanding of the nature of the derivative $\frac{d^2}{dt^2}$ as an operator and not as an algebraic variable. Do mathematicians agree with the above comment? Can we interpret $\frac{d^2}{dt^2}$ as if it is repeating the $d$ operator twice divided by time interval squared? Thanks.
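The two numerical options described in this question can be compared directly. The following is a minimal sketch in Python with NumPy, assuming a uniformly sampled signal; the sine test signal and the function name `second_derivative_central` are illustrative assumptions, not part of the original question. It shows that the central difference formula divides by the square of the sampling interval and that differencing twice gives the same values.

```python
import numpy as np

def second_derivative_central(y, dt):
    """Central-difference estimate of y'' at the interior sample points.

    Uses (y[i+1] - 2*y[i] + y[i-1]) / dt**2, so the denominator is the
    square of the sampling interval.
    """
    y = np.asarray(y, dtype=float)
    return (y[2:] - 2.0 * y[1:-1] + y[:-2]) / dt**2

dt = 1.0 / 250.0                        # 250 Hz sampling, as in the question
t = np.arange(0.0, 1.0, dt)
y = np.sin(2.0 * np.pi * t)             # test signal with a known second derivative

d2_direct = second_derivative_central(y, dt)

# Option 1 from the question: difference the data twice, dividing by dt each time.
d2_twice = np.diff(y, n=2) / dt**2

d2_exact = -(2.0 * np.pi) ** 2 * np.sin(2.0 * np.pi * t[1:-1])
max_err = np.abs(d2_direct - d2_exact).max()
print("max abs error vs exact:", max_err)
print("both options agree:", np.allclose(d2_direct, d2_twice))
```

Dividing by `dt` once per differencing step is what produces the `dt**2` in the direct formula, which matches the dimensional argument in the question.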
asked 2021-03-07
Dani says she is thinking of a secret number. As a clue, she says the number is the least whole number that has three different prime factors. What is Dani's secret number? What is its prime factorization?
asked 2022-07-09
Friesen and Shine (2019) wanted to determine whether male Australian cane toads have different testes sizes in different parts of the species' range (edge of the range vs. core of the range). As part of the study, they needed to quantify how big a toad's testes are relative to the toad's body size. They decided to perform a linear regression of total testes mass (in mg) against body mass (in g) and use the residual for each toad as a measure of the toad's relative testes size. The Coefficient Estimates table for their least-squares regression procedure is shown below.
Term          Coefficient   Standard Error   t value   Pr(>|t|)
(intercept)   19.192        43.082           0.44547   0.65643
body mass     3.0063        0.36387          8.262     1.5733e-14
One toad in the dataset had a body mass of 128 g and a total testes mass of 163 mg. Compute the residual corresponding to this toad and write a sentence interpreting the residual.
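The residual asked for here is the observed value minus the value predicted by the fitted line. A minimal sketch of that arithmetic in Python, using the coefficient estimates from the table above (the variable names are illustrative assumptions):

```python
# Coefficient estimates taken from the regression table.
intercept = 19.192          # mg
slope = 3.0063              # mg of testes mass per g of body mass

body_mass_g = 128.0         # the toad's body mass
observed_testes_mg = 163.0  # the toad's observed total testes mass

# Predicted testes mass from the fitted line, then residual = observed - predicted.
predicted_testes_mg = intercept + slope * body_mass_g
residual_mg = observed_testes_mg - predicted_testes_mg

print(f"predicted: {predicted_testes_mg:.4f} mg")
print(f"residual:  {residual_mg:.4f} mg")
```

A negative residual means this toad's testes mass is lower than the regression line predicts for a toad of its body mass.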
asked 2022-07-08
Benefits of factoring matrix multiplication into two matrix multiplications?
Assume we have a linear transformation from vectors in $\mathbb{R}^n$ to vectors in $\mathbb{R}^m$, denoted $y = Wx$, and that the goal is to learn the values of $W$ (or at least the values that produce minimum loss) from data.
My question is as follows:
Is there any benefit from splitting the linear transformation into two matrices and learning the factors instead of learning the original one, i.e., if $W = AB$, then learn $A$ and $B$ instead of learning $W$? Are there any mathematical properties of such factoring that make it useful?
Unfortunately, I am unable to find a mathematical term for this matrix factorization (if this is a proper term to use), so it would be great if someone could tell me what I should be reading about.
Also, the only benefit I could find is that if $A \in \mathbb{R}^{m \times q}$, $B \in \mathbb{R}^{q \times n}$ and $q < n$, the number of parameters needed would be $(n + m) \times q$ instead of $n \times m$.
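The parameter counts mentioned in this question can be made concrete with a short Python sketch. The function names and the example sizes are illustrative assumptions. Note that the factored form has fewer parameters only when $q(m + n) < mn$, which is a stricter condition than $q < n$.

```python
def dense_params(m: int, n: int) -> int:
    """Number of entries in a single m x n matrix W."""
    return m * n

def factored_params(m: int, n: int, q: int) -> int:
    """Number of entries in A (m x q) plus B (q x n), where W = A @ B."""
    return q * (m + n)

# Illustrative sizes (assumptions): map R^1024 -> R^512 through an inner dimension of 64.
m, n, q = 512, 1024, 64
print("dense:   ", dense_params(m, n))
print("factored:", factored_params(m, n, q))

# Largest inner dimension q for which the factored form is still smaller.
break_even_q = (m * n - 1) // (m + n)
print("factoring saves parameters for q <=", break_even_q)
```

Because the product $AB$ has rank at most $q$, choosing $q < \min(m, n)$ also restricts $W$ to low-rank matrices, so the saving in parameters comes with a constraint on what can be represented.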
asked 2022-06-21
I have an algorithm for calibrating a vector magnetometer. The input is N readings of the x, y, z axes: $(x_1, x_2, \dots, x_N)$, $(y_1, y_2, \dots, y_N)$, and $(z_1, z_2, \dots, z_N)$.
The algorithm fits an ellipsoid to the data by estimating a symmetric $3 \times 3$ matrix $A$. In order to calibrate the system, it needs to calculate $\sqrt{A}$. I am adapting the algorithm for a microcontroller with very little memory, so I cannot load standard matrix manipulation libraries.
Is there an explicit formula for calculating the square root of a $3 \times 3$ positive definite matrix?
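For reference, the matrix square root in question can be defined through the eigendecomposition: if $A = Q \,\mathrm{diag}(w)\, Q^T$ with all $w_i > 0$, then $\sqrt{A} = Q \,\mathrm{diag}(\sqrt{w})\, Q^T$. The sketch below uses NumPy purely as a desktop check against which a hand-written microcontroller routine could be compared; it is not the library-free closed form the question asks for, and the function name `sqrtm_spd` and the example matrix are assumptions.

```python
import numpy as np

def sqrtm_spd(A):
    """Principal square root of a symmetric positive definite matrix.

    Computes Q @ diag(sqrt(w)) @ Q.T from the eigendecomposition A = Q @ diag(w) @ Q.T.
    """
    A = np.asarray(A, dtype=float)
    w, Q = np.linalg.eigh(A)            # w: eigenvalues, Q: orthonormal eigenvectors
    return (Q * np.sqrt(w)) @ Q.T       # scales column j of Q by sqrt(w[j])

# Illustrative symmetric positive definite matrix (an assumption).
A = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])

S = sqrtm_spd(A)
print("S @ S reproduces A:", np.allclose(S @ S, A))
```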
asked 2022-07-01
Roadmap for learning Topological Data Analysis?
I'm a math major who has recently graduated and I will be starting full time work in 'data analysis'.
Having finished with decent marks and still being incredibly interested in mathematics, I was thinking of pursuing graduate study/research at some point in the future. I was reading up about possible areas of study for this when I came across topological data analysis, which (as I understand it) is an application of algebraic topology to data analysis.
Given my situation, I was intrigued by the concept and I would like to do some self study so I can have a working understanding of the subject. I have only done basic undergraduate abstract algebra, analysis and point set topology, and I am currently reading Munkres' Topology (Chapter 9 onwards). How do I get from where I am now to understanding the theory behind TDA and being able to apply it?
My knowledge on further mathematics is far from extensive and I would appreciate any advice on links/texts which I could use to learn the relevant material.