Clustering data using mixed integer linear programmingI am trying to understand if it is possible to use mixed integer linear programming (MILP) in order to perform a basic clustering operation to a dataset D. I know there exists standard algorithms for it but I am particularly interested in MILP.Assume one has the dataset D = D 1 ∪ D 2 with: D 1 = [ 1 , 2 , 2 , 3 , 4 , 3 , 1 , 0 , 2 , 2 , 1 ] D 2 = [ 7 , 6 , 5 , 8 , 7 , 7 , 4 , 7 , 9 , 8 , 6 ]Is there some MILP algorithm that can cluster the concatenated and shuffled data SD, with S some permutation acting on D, which can cluster each element of D to either D 1 or D 2 ?

Question

Clustering data using mixed integer linear programmingI am trying to understand if it is possible to use mixed integer linear programming (MILP) in order to perform a basic clustering operation to a dataset D. I know there exists standard algorithms for it but I am particularly interested in MILP.Assume one has the dataset   D  =      D    1    ∪      D    2   with:      D    1    =  [  1  ,  2  ,  2  ,  3  ,  4  ,  3  ,  1  ,  0  ,  2  ,  2  ,  1  ]        D    2    =  [  7  ,  6  ,  5  ,  8  ,  7  ,  7  ,  4  ,  7  ,  9  ,  8  ,  6  ]Is there some MILP algorithm that can cluster the concatenated and shuffled data SD, with S some permutation acting on D, which can cluster each element of D to either       D    1   or       D    2  ?

Dobermann82 · Accepted Answer

Step 1Let continuous variables       x    1   and       x    2   represent the centers of the two clusters. Let binary variable       y    i   indicate whether element index i (with value       d    i  ) is in       D    1  . Now minimize       ∑    i        |        d    i    −      c    1        |        y    i    +      ∑    i        |        d    i    −      c    2        |    (  1  −      y    i    )  ,, which you can linearize as follows. Introduce nonnegative continuous variables       z          i      ,      1       and       z          i      ,      2       to represent the summands, and enforce       y    i    =  1    ⟹        z          i      ,      1        ≥      |        d    i    −      c    1        |   and       y    i    =  0    ⟹        z          i      ,      1        ≥      |        d    i    −      c    2        |  .Step 2The problem is to minimize       ∑    i        z          i      ,      1        +      ∑    i        z          i      ,      2      subject to linear big-M constraints                    (                  d          i                −                  c          1                )        −                  z                      i            ,            1                                              ≤                  M          1                (        1        −                  y          i                )                            −        (                  d          i                −                  c          1                )        −                  z                      i            ,            1                                              ≤                  M          1                (        1        −                  y          i                )                            (                  d          i                −                  c          2                )        −                  z                      i            ,            2                                              ≤                  M          2                          y          i                                    −        (                  d          i                −                  c          2                )        −                  z                      i            ,            2                                              ≤                  M          2                          y          i

Clustering data using mixed integer linear programming I am trying to understand if it is possible

Answered question

Answer & Explanation

New Questions in Discrete math