Let X be a m × n (m: number of records, and n: number of attribut

excluderho

Answered question

2022-06-15

Let $X$ be an $m \times n$ normalized dataset ($m$: number of records, $n$: number of attributes) with entries between 0 and 1. Denote $Y = XR$, where $R$ is an $n \times p$ matrix and $p < n$. I understand that if the entries of $R$ are drawn randomly from a Gaussian distribution, e.g., $N(0, 1)$, then the transformation preserves the Euclidean distances between instances (all of the pairwise distances between the points in the feature space are preserved). But what if $R \sim U(0, 1)$? Does the transformation still preserve the distances between instances?

Answer & Explanation

Nola Rivera

Beginner, 2022-06-16, added 21 answers

Suppose we have $m$ records $(X_i : 1 \le i \le m)$ of normalized $n$-dimensional data (that is, for each $i \le m$, we can write $X_i = (x_1^{(i)}, \dots, x_n^{(i)}) \in [0, 1]^n$). Then multiplying any of the $X_i$ by an $n \times p$ matrix $R$ (where $p < n$) can be seen as projecting the vector $X_i$ onto $\mathbb{R}^p$. Upon reading your question, I understand that you're interested in projections that can be seen as somewhat faithful, in that there will be little difference between the Euclidean distance $\|X_i - X_j\|_2$ between vectors in $\mathbb{R}^n$ and the distance $\|(X_i - X_j)R\|_2$ between their projections in $\mathbb{R}^p$.
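To make the comparison of distances concrete, here is a minimal sketch in Python with NumPy. The helper names `pairwise_distances` and `distance_ratios`, the sizes `m`, `n`, `p`, and the seed are all assumptions chosen for illustration; they do not come from the question. The sketch computes, for every pair of records, the ratio of the projected distance to the original distance, so a ratio of exactly 1 would mean that the pair's distance is preserved.

```python
import numpy as np

def pairwise_distances(points):
    """Euclidean distance between every pair of rows, as an (m, m) array."""
    diffs = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diffs, axis=-1)

def distance_ratios(X, R):
    """Projected distance divided by original distance for each pair i < j."""
    original = pairwise_distances(X)
    projected = pairwise_distances(X @ R)
    i, j = np.triu_indices(X.shape[0], k=1)  # indices of pairs with i < j
    return projected[i, j] / original[i, j]

rng = np.random.default_rng(0)           # seed chosen arbitrarily
m, n, p = 5, 10, 3                       # illustrative sizes with p < n
X = rng.uniform(0.0, 1.0, size=(m, n))   # normalized data in [0, 1]^n
R = rng.normal(0.0, 1.0, size=(n, p))    # i.i.d. N(0, 1) entries
ratios = distance_ratios(X, R)
print(ratios)  # one ratio per pair of records
```

Replacing `rng.normal(0.0, 1.0, size=(n, p))` with `rng.uniform(0.0, 1.0, size=(n, p))` gives the $U(0, 1)$ variant from the question, so the two choices of $R$ can be compared on the same data.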
First, I believe your claim that if the entries of $R$ are i.i.d. $N(0, 1)$ random variables, then the transformation preserves the distance is false. To see this, consider a simple counterexample: suppose we have two instances with two attributes, $X_1 = [1, 1]$ and $X_2 = [0, 1]$. Then we have that
$$\|X_1 - X_2\|_2 = \sqrt{1^2 + 0} = 1.$$
Let $R = [R_1, R_2]^\top$ be our transformation (so $n = 2$ and $p = 1$). Then, if we want the norm to be preserved, it is necessary that
$$1 = |X_1 R - X_2 R| = |(R_1 + R_2) - R_2| = |R_1|,$$
in which case $R_1$ can only take the values $-1$ and $1$, and hence is not $N(0, 1)$.
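The counterexample can also be checked numerically. The following is a sketch only: the helper `projected_distance`, the number of trials, the tolerance `tol`, and the seed are assumptions introduced for illustration. It draws $R$ many times with i.i.d. $N(0, 1)$ entries and with i.i.d. $U(0, 1)$ entries, and reports how often the projected distance $|R_1|$ lands within `tol` of the original distance 1. Note that for $U(0, 1)$ entries we have $|R_1| < 1$, so in this example the projected distance is never larger than the original one.

```python
import numpy as np

X1 = np.array([1.0, 1.0])
X2 = np.array([0.0, 1.0])
original = np.linalg.norm(X1 - X2)  # Euclidean distance between the two instances

def projected_distance(R):
    """|X1 R - X2 R| for a length-2 vector R (the p = 1 case)."""
    return abs(X1 @ R - X2 @ R)

rng = np.random.default_rng(0)  # seed chosen arbitrarily
trials = 10_000                 # illustrative number of random draws of R

# Projected distances when the entries of R are N(0, 1) or U(0, 1).
gauss = np.array([projected_distance(rng.normal(size=2)) for _ in range(trials)])
unif = np.array([projected_distance(rng.uniform(size=2)) for _ in range(trials)])

tol = 0.01  # how close to the original distance counts as "preserved"
frac_gauss = np.mean(np.abs(gauss - original) < tol)
frac_unif = np.mean(np.abs(unif - original) < tol)
print(frac_gauss, frac_unif)  # fraction of draws that nearly preserve the distance
```

A small fraction in both cases is consistent with the argument above: preserving the distance in this example requires $|R_1| = 1$, which neither distribution produces reliably.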
