Big O - Explanation needed

I'm reading through a discrete math book and it's talking about Big O notation. The concept makes sense to me, but I don't understand how they're applying it. They give examples and state that things are true without explaining why.

An example they give is

"Show that f(x) = x^2 + 2x + 1 is O(x^2)"

What exactly is this asking? To show that the first function is related to x^2 by use of witnesses? I don't even know what that means.

"Solution: We observe that we can readily estimate the size of f(x) when x>1 because x<x^2 and 1<x^2. It follows that

0 <= x^2 + 2x + 1 <= x^2 + 2x^2 + x^2 = 4x^2

whenever x > 1."

I have no idea what they did here or how they came up with "4x^2". Why would they replace 1 with x^2?
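
For what it's worth, here is a quick numeric sanity check of the book's inequality. It is only a sketch, not a proof: the names f, C and k are my own, and I'm assuming the "witnesses" are the constants C = 4 and k = 1 taken from the bound 4x^2 and the condition x > 1. It only samples x between 1 and 100.

```python
def f(x):
    # The function from the book's example.
    return x**2 + 2*x + 1

# Assumed witnesses, read off the quoted solution: f(x) <= C * x^2 whenever x > k.
C, k = 4, 1

# Sample points strictly greater than k, from 1.01 up to 100.
xs = [k + i / 100 for i in range(1, 9901)]

# Term-by-term facts the solution relies on: x < x^2 and 1 < x^2 when x > 1.
terms_ok = all(x < x**2 and 1 < x**2 for x in xs)

# The full chain 0 <= f(x) <= 4x^2 on the sampled points.
bound_ok = all(0 <= f(x) <= C * x**2 for x in xs)

print(terms_ok, bound_ok)
```

The inequality does hold on every sampled point, but that doesn't tell me why the substitution is a legitimate step.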

Any information would be appreciated.

Thanks