Let <mo fence="false" stretchy="false">{ X 1 </msub> , &#x2026;<!-- … -->

ttyme411gl

ttyme411gl

Answered question

2022-07-09

Let { X 1 , , X K } is a set of random matrices, where X k R M × N , k = 1 , , K, and U R M × r and V R N × r are two matrices containing orthogonal columns (i.e., U U = I , V V = I). I was wondering, if the following question has a analytical solution:
max U , V k = 1 K U X k V F 2
If not, how should I solve it? Alternating optimization?

(At first, I thought it may be related to the SVD of the sum of the matrices { X k }, but so far I have no hint to prove it.)

Answer & Explanation

Nicolas Calhoun

Nicolas Calhoun

Beginner2022-07-10Added 15 answers

maximize k = 1 K U X k V F 2 subject to U U = I V V = I
Let
f k ( U , V ) := U X k V F 2 = tr ( U X k V V X k U ) = tr ( V X k U U X k V )
Hence,
U f k ( U , V ) = 2 X k V V X k U
V f k ( U , V ) = 2 X k U U X k V
Let the Lagrangian be
L ( U , V , Λ 1 , Λ 2 ) := k = 1 K f k ( U , V ) Λ 1 , U U I Λ 2 , V V I
where the Lagrange multipliers Λ 1 and Λ 2 are symmetric matrices. Taking the partial derivatives with respect to U and V ,
U L ( U , V , Λ 1 , Λ 2 ) = 2 k = 1 K X k V V X k U 2 U Λ 1
V L ( U , V , Λ 1 , Λ 2 ) = 2 k = 1 K X k U U X k V 2 V Λ 2
Finding where the partial derivatives vanish, we obtain two cubic matrix equations in U , V , Λ 1 , Λ 2 and two quadratic matrix equations in U and V
k = 1 K X k V V X k U = U Λ 1 k = 1 K X k U U X k V = V Λ 2 U U = I V V = I
How can one solve these matrix equations? I do not know.
Ximena Skinner

Ximena Skinner

Beginner2022-07-11Added 7 answers

Define the variables
Y = U U T y = v e c ( Y ) Z = V V T z = v e c ( Z ) S = k X k X k
Note that ( Y , Z ) are not orthogonal but they are ortho-projectors
Y 2 = Y = Y T Z 2 = Z = Z T
Similarly, the vectors ( y , z ) are not unit vectors but they satisfy simple normalization conditions
y T y = M z T z = N
Write the objective function as
f = k U T X k V F 2 = k ( U T X k V ) : ( U T X k V ) = k ( Y X k ) : ( X k Z ) = k v e c ( Y X k ) T v e c ( X k Z ) = k y T ( X k I M ) ( I N X k ) z = y T S z
where colon denotes the inner/Frobenius product.

The main result of these manipulations was to recast the problem as a single matrix-vector equation. The vector gradients are then
f y = S z , f z = S T y
This still isn't a solution to the problem, just another way of thinking about it.

Do you have a similar question?

Recalculate according to your conditions!

New Questions in High school geometry

Ask your question.
Get an expert answer.

Let our experts help you. Answer in as fast as 15 minutes.

Didn't find what you were looking for?