"A common theme in linear compression and feature extraction is to map a high dimensional vector

Flakqfqbq 2022-05-07 Answered
"A common theme in linear compression and feature extraction is to map a high dimensional vector x to a lower dimensional vector y = W x such that the information in the vector x is maximally preserved in y. Opten PCA is applied for this purpose. However, the optimal setting for W is in generall not given by the widely used PCA. Actually, PCA is sub-optimal special case of mutual information maximisation."

Can anyone elaborate on why PCA is a sub-optimal special case of mutual information maximisation?

Answers (1)

Answered 2022-05-08 Author has 21 answers
An easy example to explain: suppose you have two sets of functions that are heavily corrupted by high-energy noise, and you want to find which parts / subsets / linear combinations of them correspond to each other the most.

If we just apply PCA, it will optimize subspaces by looking for the directions of highest L2 norm (in various senses). But if our noise has a higher L2 norm than the functions of interest, PCA will select the noise rather than the functions of interest! And we know that independently sampled, uncorrelated noise has very low mutual information with just about any deterministic quantity of interest.

Therefore we do better by searching for a method that focuses not so much on the norm of the actual signal/function but on some statistical correspondence, for example cross-correlation or covariance.
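This failure mode is easy to demonstrate numerically. The following is a minimal sketch under a toy 2-D setup of my own choosing: one low-amplitude deterministic signal dimension plus one independent high-energy noise dimension.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
signal = np.sin(np.linspace(0, 20 * np.pi, n))   # deterministic signal, amplitude 1
noise = 5.0 * rng.standard_normal(n)             # independent noise with much higher energy
X = np.column_stack([signal, noise])             # axis 0 = signal, axis 1 = noise

# PCA keeps the leading eigenvector of the sample covariance matrix,
# i.e. the direction of largest variance (L2 norm around the mean).
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
top_pc = eigvecs[:, np.argmax(eigvals)]

# The top principal component aligns with the noise axis, even though the
# noise carries essentially no mutual information with the signal.
print(top_pc)
```

Here PCA picks the high-variance noise direction and essentially discards the signal axis, which is exactly the failure mode described above.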

Relevant Questions

asked 2022-05-09
A huge conical tank is to be made from a circular piece of sheet metal of radius 10 m by cutting out a sector with vertex angle θ and welding the straight edges of the remaining piece together. Find θ so that the resulting cone has the largest possible volume.

Specifically, the question is asked in the context of wanting derivatives and multiple max/min equations, and hopefully more calculus than trigonometry or geometry.

I have gotten as far as using 10m as the hypotenuse for a triangle formed by the height of the cone, radius of the base of the cone, and slant. I'm not sure where to go from there, because I can't determine how to find height and/or radius, without which I'm not sure I can continue.
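The setup sketched above can be checked numerically. This is only a sketch under the usual interpretation (assumed here, not stated by the asker): the slant equals the sheet radius 10, and the base circumference equals the arc left after the sector is removed.

```python
import numpy as np

R = 10.0                                     # radius of the sheet = slant of the cone
theta = np.linspace(1e-6, 2 * np.pi - 1e-6, 200_000)

r = R * (2 * np.pi - theta) / (2 * np.pi)    # base radius from the remaining arc length
h = np.sqrt(R**2 - r**2)                     # height via Pythagoras (slant is the hypotenuse)
V = np.pi * r**2 * h / 3                     # cone volume

best_theta = theta[np.argmax(V)]
print(best_theta)                            # close to 2*pi*(1 - sqrt(2/3)) ≈ 1.153
```

The grid maximum lands near θ = 2π(1 − √(2/3)), i.e. the base radius r = 10√(2/3) that calculus gives when maximizing r²√(100 − r²).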
asked 2022-05-03
I have a non-negative function $g(y,x)$ that is defined using a non-negative $f(y,x)$ in the following way:
$$g(y,x) = \int_0^x f(t,y)\,dt$$
I am trying to maximize $\max_{y \in S} g(y,x)$. Using Fatou's lemma we have that
$$\max_{y \in S} g(y,x) = \max_{y \in S} \int_0^x f(t,y)\,dt \le \int_0^x \max_{y \in S} f(t,y)\,dt$$
I also have that $\max_{y \in S} f(t,y) \le h(t)$, where $\int_0^x h(t)\,dt < \infty$.

My question: when does the last inequality hold with equality?

What do I have to assume about $f(y,x)$ and $g(y,x)$ for equality to hold?

Can I apply dominated convergence theorem here?
asked 2022-05-14
I have a non-linear maximization problem and I want to convert it into a minimization problem. Can I do so by multiplying the objective by a negative sign, or is that wrong? If it is wrong, what should I do instead?
asked 2022-05-07
I have to maximize the following function:
$$\max \; A\,\frac{C_1^m}{m} + (1-A)\,\frac{C_2^m}{m}$$
subject to
$$C_1 \le 5(1-x) + x, \qquad C_2 \le 3(1-x) + 7x$$
I wrote it as:
$$L(x) = f(x) - \lambda_1\big(C_1 - 5(1-x) - x\big) - \lambda_2\big(C_2 - 3(1-x) - 7x\big) - \lambda_3(x-1) - \lambda_4(x-10)$$
Can I write the last constraint partitioned into $\lambda_3$ and $\lambda_4$ like this? Is there some other way to introduce such box constraints into the same problem?
asked 2022-04-07
I am trying to solve the following question:
$$\text{Maximize } f(x_1, x_2, \ldots, x_n) = 2\sum_{i=1}^{n} x_i^t A x_i + \sum_{i=1}^{n-1} \sum_{j>i}^{n} \left( x_i^t A x_j + x_j^t A x_i \right)$$
subject to
$$x_i^t x_j = \begin{cases} 1 & i = j \\ -\tfrac{1}{n} & i \neq j \end{cases}$$
where the $x_i$'s are column vectors ($m \times 1$ matrices) with real entries and $A$ is an $m \times m$ ($n < m$) real symmetric matrix.

From some source, I know the answer is
$$f_{\max} = \frac{n+1}{n} \sum_{i=1}^{n} \lambda_i,$$
$\lambda_i$ being the eigenvalues of $A$ sorted in non-increasing order (counting multiplicity). But I am unable to prove it. I would appreciate any help (preferably with an established matrix inequality, or Lagrange's multiplier method).
asked 2022-05-10
Consider that I have two items. Let $P_i(x_i)$ denote the profit I get from item $i$ by taking $x_i$ units of it (assume that $x_i$ can be a real number). Let $W_i(x_i)$ denote the weight consumed by item $i$ when I take $x_i$ units of it. Assume that both the profit and weight functions are monotonically increasing in the quantity picked. The problem I am interested in is
$$\max_{x_1, x_2} \; P_1(x_1) + P_2(x_2) \quad \text{s.t.} \quad W_1(x_1) + W_2(x_2) \le W$$
where $W$ is the total weight allowed. Thus I need to find $x_i$ (the quantity of item $i$) such that the total profit is maximized while keeping the total weight under a given limit. Let $x_1^*$ and $x_2^*$ be the solution to this problem. Is it true that
$$\frac{P_1(x_1^*)}{W_1(x_1^*)} = \frac{P_2(x_2^*)}{W_2(x_2^*)}\,?$$
Note that this ratio is essentially the "profit per unit weight" of an item. Suppose the ratios were different at the optimum, with the first item having the larger one. Then I can get more profit per unit weight out of the first item, so increasing its quantity should increase profit. To keep the total weight balanced, I decrease the second item's quantity, decreasing its profit as well. But since the increase in profit from item 1 is larger, this should compensate for the loss from item 2 and make the total profit even higher. What is flawed with this logic?
asked 2022-05-08
Let $t \in \mathbb{R}$, $n = 1, 2, \ldots$, $p \in [0,1]$ and $a \in (p, 1]$.

Show that
$$\sup_t \left( ta - \log\!\left( p e^t + (1-p) \right) \right) = a \log\!\left( \frac{a}{p} \right) + (1-a) \log\!\left( \frac{1-a}{1-p} \right)$$
So, I've tried differentiating, but I didn't succeed, since my result is different: I get $\log\!\left( \frac{a(1-p)}{p(1-a)} \right)$. Any ideas?
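One way to sanity-check the identity before proving it is to maximize the objective on a grid. This is a numeric sketch, not a proof; the values p = 0.3 and a = 0.7 are arbitrary picks of mine.

```python
import numpy as np

p, a = 0.3, 0.7                              # any p in (0, 1) and a in (p, 1) will do
t = np.linspace(-20.0, 20.0, 2_000_000)

objective = t * a - np.log(p * np.exp(t) + (1 - p))
lhs = objective.max()                        # grid approximation of the supremum
rhs = a * np.log(a / p) + (1 - a) * np.log((1 - a) / (1 - p))
print(lhs, rhs)                              # the two values agree up to grid resolution

t_star = t[np.argmax(objective)]
print(t_star, np.log(a * (1 - p) / (p * (1 - a))))   # maximizer matches the derivative result
```

Note that the expression obtained from setting the derivative to zero is the maximizing $t^*$, not the supremum itself; substituting $t^*$ back into $ta - \log(pe^t + (1-p))$ yields the claimed right-hand side.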