Big-O Notation in the Geometric Interpretation of the Derivative

I just watched the first lecture of MIT 18.01 Single Variable Calculus, Fall 2006, in which the lecturer explained the geometric interpretation of the derivative. The derivative of a function $\displaystyle f(x)$ can be defined as:

$\displaystyle f'(x) = \lim_{\Delta x \to 0} \frac{f(x+\Delta x) - f(x)}{\Delta x}$

The lecturer then used this to find the derivative of the function $\displaystyle f(x) = x^n$, and this is where it starts to get confusing for me. Using the definition above, the derivative of the function $\displaystyle f(x) = x^n$ is:

$\displaystyle f'(x) = \lim_{\Delta x \to 0} \frac{(x+\Delta x)^n - x^n}{\Delta x}$

The part which confuses me is when the term $\displaystyle (x+\Delta x)^n$ is evaluated using the Binomial Theorem:

$\displaystyle (x+\Delta x)^n = x^n + n(\Delta x)x^{n-1}+\mathit{O}((\Delta x)^2)$
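
For reference, this is the Binomial Theorem written out in full (my own expansion, not copied from the lecture). As far as I can tell, every term after the first two contains at least two factors of $\displaystyle \Delta x$:

$\displaystyle (x+\Delta x)^n = x^n + n(\Delta x)x^{n-1} + \binom{n}{2}(\Delta x)^2 x^{n-2} + \cdots + (\Delta x)^n$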

Why are the rest of the terms replaced by the Big-O Notation $\displaystyle \mathit{O}((\Delta x)^2)$? And why the term $\displaystyle (\Delta x)^2$ specifically?
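
To get a feel for it, I tried a small numerical check in Python. This is only a sketch under my own assumptions: the function names and the sample values $\displaystyle x = 2$ and $\displaystyle n = 5$ are mine, not from the lecture. It divides the leftover terms by $\displaystyle (\Delta x)^2$ to see whether that ratio stays bounded as $\displaystyle \Delta x$ shrinks:

```python
from math import comb


def remainder(x, dx, n):
    """(x + dx)^n minus the first two binomial terms x^n + n*dx*x^(n-1)."""
    return (x + dx) ** n - x ** n - n * dx * x ** (n - 1)


def remainder_ratios(x=2.0, n=5, steps=(1e-1, 1e-2, 1e-3)):
    """Divide the leftover terms by dx^2 for a few shrinking step sizes."""
    return [remainder(x, dx, n) / dx ** 2 for dx in steps]


ratios = remainder_ratios()
# Coefficient of the (dx)^2 term for the sample values x = 2, n = 5.
leading = comb(5, 2) * 2.0 ** 3

for dx, ratio in zip((1e-1, 1e-2, 1e-3), ratios):
    print(f"dx = {dx:g}: remainder / dx^2 = {ratio:.4f}")
print(f"coefficient of the (dx)^2 term: {leading:.1f}")
```

With these sample values the ratio seems to settle near $\displaystyle \binom{5}{2} \cdot 2^3 = 80$ instead of blowing up, which I take to mean the leftover terms shrink at least as fast as $\displaystyle (\Delta x)^2$.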

Thanks!