Let { X 1 , … , X K } is a set of random matrices, where X k ∈ R M × N , k = 1 , … , K, and U ∈ R M × r and V ∈ R N × r are two matrices containing orthogonal columns (i.e., U ⊤ U = I , V ⊤ V = I). I was wondering, if the following question has a analytical solution: max U , V ∑ k = 1 K ‖ U ⊤ X k V ‖ F 2 If not, how should I solve it? Alternating optimization?(At first, I thought it may be related to the SVD of the sum of the matrices { X k }, but so far I have no hint to prove it.)

Question

Let   {      X    1    ,  …  ,      X    K    } is a set of random matrices, where       X    k    ∈            R              M      ×      N        ,  k  =  1  ,  …  ,  K, and   U  ∈            R              M      ×      r       and   V  ∈            R              N      ×      r       are two matrices containing orthogonal columns (i.e.,       U    ⊤    U  =  I  ,      V    ⊤    V  =  I). I was wondering, if the following question has a analytical solution:            max              U        ,        V                    ∑              k        =        1            K        ‖          U      ⊤              X      k        V          ‖      F      2      If not, how should I solve it? Alternating optimization?(At first, I thought it may be related to the SVD of the sum of the matrices   {      X    k    }, but so far I have no hint to prove it.)

Nicolas Calhoun · Accepted Answer

maximize                                          ∑                          k              =              1                                      K                                ‖                                    U                                      ⊤                                                          X                        k                                V                                ‖                          F                        2                                              subject to                                          U                                ⊤                                    U                =                  I                                                                V                                ⊤                                    V                =                  I                    Let      f    k    (      U    ,      V    )  :=  ‖            U              ⊤                  X        k        V        ‖          F        2    =      tr        (                  U                    ⊤                            X            k              V                      V                    ⊤                            X            k              ⊤                    U        )    =      tr        (                  V                    ⊤                            X            k              ⊤                    U                      U                    ⊤                            X            k              V        )  Hence,      ∂                  U                    f    k    (      U    ,      V    )  =  2              X        k        V              V              ⊤                  X        k          ⊤            U        ∂                  V                    f    k    (      U    ,      V    )  =  2              X        k          ⊤            U              U              ⊤                  X        k        V  Let the Lagrangian be      L    (      U    ,      V    ,      Λ    1    ,      Λ    2    )  :=      ∑          k      =      1              K            f    k    (      U    ,      V    )  −  ⟨      Λ    1    ,            U              ⊤            U    −      I    ⟩  −  ⟨      Λ    2    ,            V              ⊤            V    −      I    ⟩where the Lagrange multipliers       Λ    1   and       Λ    2   are symmetric matrices. Taking the partial derivatives with respect to       U   and       V  ,      ∂                  U                  L    (      U    ,      V    ,      Λ    1    ,      Λ    2    )  =  2      ∑          k      =      1              K                  X        k        V              V              ⊤                  X        k          ⊤            U    −  2      U        Λ    1        ∂                  V                  L    (      U    ,      V    ,      Λ    1    ,      Λ    2    )  =  2      ∑          k      =      1              K                  X        k          ⊤            U              U              ⊤                  X        k        V    −  2      V        Λ    2  Finding where the partial derivatives vanish, we obtain two cubic matrix equations in       U    ,      V    ,      Λ    1    ,      Λ    2   and two quadratic matrix equations in       U   and       V                                                                                                                ∑                                          k                      =                      1                                                              K                                                                                                  X                                        k                                                        V                                                                              V                                                              ⊤                                                                                                  X                                        k                                          ⊤                                                                            U                                                                              =                                  U                                                  Λ                  1                                                                                                                          ∑                                          k                      =                      1                                                              K                                                                                                  X                                        k                                          ⊤                                                                            U                                                                              U                                                              ⊤                                                                                                  X                                        k                                                        V                                                                              =                                  V                                                  Λ                  2                                                                                                                          U                                                        ⊤                                                                    U                                                            =                                  I                                                                                                                          V                                                        ⊤                                                                    V                                                            =                                  I                                                                        How can one solve these matrix equations? I do not know.

Ximena Skinner · Accepted Answer

Define the variables                    Y                    =        U                  U          T                                    ⟹                y        =                  v          e          c                (        Y        )                            Z                    =        V                  V          T                                    ⟹                z        =                  v          e          c                (        Z        )                            S                    =                  ∑          k                          X          k                ⊗                  X          k                    Note that   (  Y  ,  Z  ) are not orthogonal but they are ortho-projectors                              Y          2                            =        Y        =                  Y          T                                              Z          2                            =        Z        =                  Z          T                    Similarly, the vectors   (  y  ,  z  ) are not unit vectors but they satisfy simple normalization conditions                              y          T                y                    =        M                                      z          T                z                    =        N            Write the objective function as                    f                    =                  ∑          k                        ‖                  U          T                          X          k                V                  ‖          F          2                                          =                  ∑          k                (                  U          T                          X          k                V        )        :        (                  U          T                          X          k                V        )                                  =                  ∑          k                (        Y                  X          k                )        :        (                  X          k                Z        )                                  =                  ∑          k                          v          e          c                (        Y                  X          k                          )          T                          v          e          c                (                  X          k                Z        )                                  =                  ∑          k                          y          T                (                  X          k                ⊗                  I          M                )        (                  I          N                ⊗                  X          k                )        z                                  =                  y          T                S        z            where colon denotes the inner/Frobenius product.The main result of these manipulations was to recast the problem as a single matrix-vector equation. The vector gradients are then                                          ∂            f                                ∂            y                                      =        S        z        ,                                                                      ∂            f                                ∂            z                                      =                  S          T                y                          This still isn&#039;t a solution to the problem, just another way of thinking about it.

Let <mo fence="false" stretchy="false">{ X 1 </msub> , …

Answered question

Answer & Explanation

New Questions in High school geometry