See this thread.
Yes, that is what it means; and no, for this proof it's not important to take the positive square root of A. In general, a positive definite matrix has many square roots (but only one of them is positive definite), and for this proof any square root would do.
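As a small illustration, assuming "square root of $A$" here means any matrix $B$ with $B^2 = A$ (the diagonal matrix below is just a convenient example, not one taken from the question):

$$A = \begin{pmatrix} 4 & 0 \\ 0 & 9 \end{pmatrix}, \qquad B = \begin{pmatrix} \pm 2 & 0 \\ 0 & \pm 3 \end{pmatrix}, \qquad B^2 = A \ \text{for each of the four sign choices.}$$

Only $B = \operatorname{diag}(2, 3)$ has both eigenvalues positive, so it is the one positive definite square root; the other three choices are symmetric square roots of $A$ that are not positive definite.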
The only way I know to prove that is to use the norm of an operator. Define , where . Then and . It follows that
.