I am trying to find the minimizer of the function

 \left\|  \nabla f(x) + \lambda ^T \nabla h(x) + \mu ^T \right\|^2
s.t. \mu _i \geq 0 , \mu _i = 0 if x_i > 0

We use the function \phi _i (\mu ) = \min \left\{ \mu _i , x_i  \right\} .
We have that \phi ( \mu ) = 0 \Leftrightarrow \mu \geq 0 , \; x \geq 0 , \; \mu ^T x = 0
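As a quick sanity check of this equivalence, here is a minimal Python sketch. The helper names (`phi`, `phi_is_zero`, `is_complementary`) and the sample vectors are my own illustration, not part of the problem statement.

```python
def phi(mu, x):
    """Componentwise residual phi_i = min(mu_i, x_i)."""
    return [min(m, xi) for m, xi in zip(mu, x)]

def phi_is_zero(mu, x, tol=1e-12):
    """True when every component of phi is zero (up to tol)."""
    return all(abs(p) <= tol for p in phi(mu, x))

def is_complementary(mu, x, tol=1e-12):
    """True when mu >= 0, x >= 0 and mu^T x = 0 (up to tol)."""
    nonneg = all(m >= -tol for m in mu) and all(xi >= -tol for xi in x)
    dot = sum(m * xi for m, xi in zip(mu, x))
    return nonneg and abs(dot) <= tol

# Hypothetical sample pairs (mu, x), chosen only for illustration.
samples = [
    ([0.0, 2.0], [3.0, 0.0]),   # complementary pair
    ([1.0, 2.0], [3.0, 0.0]),   # mu_1 * x_1 != 0
    ([-1.0, 0.0], [0.0, 1.0]),  # mu^T x = 0, but mu_1 < 0
]
results = [(phi_is_zero(mu, x), is_complementary(mu, x)) for mu, x in samples]
print(results)
```

The third sample shows why the sign conditions matter: \mu ^T x = 0 alone does not force \phi (\mu ) = 0 when a component is negative.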
So we can actually solve the problem

Minimize \left\|  \nabla f(x) + \lambda ^T \nabla h(x) + \mu ^T \right\|^2 + \left\| \phi (\mu )  \right\|^2
s.t. \lambda , \mu  \geq 0

Now my reasoning is that, by letting g = \nabla f(x) + \lambda ^T \nabla h(x) , the problem becomes:

Minimize (g+ \mu ) ^2 +  \phi (\mu ) ^2 , i.e. Minimize (g+ \mu ) ^2 +  \min \left\{ \mu , x \right\}^2 ,
or Minimize  (g^2+ 2g \mu +\mu ^2 ) +  \left\{ \mu ^2 \text{ if } \mu \leq x , \quad x^2 \text{ otherwise}  \right\}

Now, I think I should first find the critical points of the function  (g^2+ 2g \mu +\mu ^2 ) +  \min \left\{ \mu , x \right\}^2 . But should I consider it as a function of \mu and g, or as a function of \mu and x?
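One possible reading, written out as a rough sketch: take the scalar case, treat g and x \geq 0 as fixed constants (an assumption on my part), and minimize over \mu \geq 0 only. The function is then piecewise quadratic in \mu , and the candidate points can be read off each piece:

```latex
% Scalar case; g and x >= 0 are held fixed (assumption), minimizing over mu >= 0.
F(\mu) = (g+\mu)^2 + \min\{\mu, x\}^2 =
\begin{cases}
(g+\mu)^2 + \mu^2, & 0 \le \mu \le x, \\
(g+\mu)^2 + x^2,   & \mu \ge x,
\end{cases}
\qquad
F'(\mu) =
\begin{cases}
2(g+\mu) + 2\mu, & 0 < \mu < x, \\
2(g+\mu),        & \mu > x.
\end{cases}

% Candidate minimizers and their objective values:
%   \mu = -g/2  if 0 \le -g/2 \le x,  with F = g^2/2
%   \mu = -g    if -g \ge x,          with F = x^2
%   \mu = 0     (boundary),           with F = g^2
```

Since F is not differentiable at \mu = x and both interior candidates can be feasible at once (when x \leq -g \leq 2x), their values would have to be compared directly. In the full problem g itself depends on x and \lambda , so this only describes the inner minimization over \mu .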

Another way of thinking about this problem: we can use the 'Fischer-Burmeister' function, which is:

\Phi (\mu , x ) = \mu + x - \sqrt{\mu ^2 + x^2}

instead of the function \phi _i (\mu ) = \min \left\{ \mu _i , x_i  \right\} , because for the 'Fischer-Burmeister' function
\Phi (\mu , x ) = 0 \Leftrightarrow  \mu \geq 0 , \; x \geq 0 , \; \mu x = 0 , just like for the previous function \phi _i (\mu ) .
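To see this property on concrete numbers, here is a small Python sketch of the scalar Fischer-Burmeister function. The function name `fischer_burmeister` and the sample pairs are my own choices for illustration.

```python
import math

def fischer_burmeister(a, b):
    """Scalar Fischer-Burmeister function: a + b - sqrt(a^2 + b^2)."""
    return a + b - math.sqrt(a * a + b * b)

# Hypothetical (mu, x) pairs, chosen only for illustration.
pairs = [
    (0.0, 3.0),   # mu = 0, x > 0   -> complementary
    (2.0, 0.0),   # mu > 0, x = 0   -> complementary
    (0.0, 0.0),   # both zero       -> complementary
    (1.0, 1.0),   # both positive   -> not complementary
    (-1.0, 0.0),  # mu < 0          -> sign condition violated
]
values = [fischer_burmeister(mu, x) for mu, x in pairs]
for (mu, x), v in zip(pairs, values):
    print(f"Phi({mu}, {x}) = {v:.4f}")
```

The first three pairs give zero, while the last two do not, which matches the equivalence stated above, including the sign conditions.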

Now, the problem would be to

Minimize  (g+ \mu ) ^2 +  \Phi (\mu , x ) ^2 .

Again, how should I go about finding this minimum?
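One way to at least explore this numerically is a brute-force search over \mu . The sketch below assumes the scalar case with g and x held fixed, and uses a plain grid on [0, mu_max]; the names `objective` and `minimize_over_mu`, the bound `mu_max`, and the step count are my own hypothetical choices. A grid search is a stand-in here, not a recommended solver.

```python
import math

def fischer_burmeister(a, b):
    """Scalar Fischer-Burmeister function: a + b - sqrt(a^2 + b^2)."""
    return a + b - math.sqrt(a * a + b * b)

def objective(mu, g, x):
    """(g + mu)^2 + Phi(mu, x)^2 for scalar g, mu, x."""
    return (g + mu) ** 2 + fischer_burmeister(mu, x) ** 2

def minimize_over_mu(g, x, mu_max=10.0, steps=10000):
    """Grid search over mu in [0, mu_max] with g and x held fixed (assumption)."""
    best_mu, best_val = 0.0, objective(0.0, g, x)
    for i in range(1, steps + 1):
        mu = mu_max * i / steps
        val = objective(mu, g, x)
        if val < best_val:
            best_mu, best_val = mu, val
    return best_mu, best_val

# Case 1: x = 0, g = -1. Phi(mu, 0) = 0 for mu >= 0, so only (g + mu)^2 remains.
mu1, val1 = minimize_over_mu(-1.0, 0.0)
# Case 2: x = 2, g = 1. (g + mu)^2 grows with mu and Phi(0, 2) = 0.
mu2, val2 = minimize_over_mu(1.0, 2.0)
print(mu1, val1)
print(mu2, val2)
```

Unlike the min-based residual, \Phi ^2 is differentiable, so a gradient-based method could replace the grid once it is settled which of x, \lambda , \mu are the actual unknowns.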