Several reports state that it is common to work with the square root of the Hessian, since its condition number (the ratio of the largest eigenvalue to the smallest) is the square root of the Hessian's condition number and therefore less severe.
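To illustrate the conditioning claim, here is a small NumPy check (the matrix H and its eigenvalues are my own illustrative choices, not from any report): for a symmetric positive definite H with eigendecomposition H = E diag(w) E', the symmetric square root is E diag(sqrt(w)) E', and its condition number is the square root of cond(H).

```python
import numpy as np

# Hypothetical SPD "Hessian" with eigenvalues 100 and 1,
# so its condition number is 100.
H = np.diag([100.0, 1.0])

# Symmetric square root via eigendecomposition:
# H = E diag(w) E'  ->  sqrt(H) = E diag(sqrt(w)) E'.
w, E = np.linalg.eigh(H)
sqrt_H = E @ np.diag(np.sqrt(w)) @ E.T

cond_H = np.linalg.cond(H)            # 100
cond_sqrt_H = np.linalg.cond(sqrt_H)  # 10 = sqrt(100)
```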
Specifically, let V be the covariance matrix, E its eigenvector matrix, and D the diagonal matrix whose entries are the square roots of the eigenvalues of V. The update equation is
x_{i+1} = x_i + EDZ'
where Z is a matrix of independent standard normal variates (arranged by rows or columns, depending on convention). The other approach I have seen is to use
x_{i+1} = x_i + Sqrt(V)Z'
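To make the connection between the two updates concrete, here is a NumPy sketch (the 2x2 matrix V is my own illustrative choice). Both ED and the symmetric square root Sqrt(V) = E D E' are "square roots" of V in the sense that S S' = V, which is exactly the property needed for S Z to have covariance V when Z is standard normal:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2x2 covariance matrix V.
V = np.array([[4.0, 1.0],
              [1.0, 3.0]])

# Factor 1: from the eigendecomposition V = E diag(w) E',
# take S1 = E diag(sqrt(w))  (the "ED" form).
w, E = np.linalg.eigh(V)
S1 = E @ np.diag(np.sqrt(w))

# Factor 2: the symmetric square root Sqrt(V) = E diag(sqrt(w)) E'.
S2 = E @ np.diag(np.sqrt(w)) @ E.T

# Both satisfy S S' = V, so both map Z ~ N(0, I) to S Z ~ N(0, V).
ok1 = np.allclose(S1 @ S1.T, V)
ok2 = np.allclose(S2 @ S2.T, V)

# Empirical check: the sample covariance of S1 Z approaches V.
Z = rng.standard_normal((2, 100_000))
emp_cov = np.cov(S1 @ Z)
```

The design point is that any S with S S' = V works; the eigendecomposition form and the symmetric square root differ only by a rotation, so they generate the same distribution of steps.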
Is there a definition or theorem that explains the use of such square root matrices in quasi-Newton methods? And what does multiplying by Z' do to the step direction?