Why is it that, when doing linear regression on a set of data points, you find the line y = mx + c that minimises the squared differences? To me it makes sense that you would want to minimise the difference, not the squared difference.

What advantage does using squared differences have over just using the differences, and why is it done this way?
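
To make the question concrete, here is a minimal sketch of the quantities being compared. The data points, the two candidate lines, and the function names are all made up for illustration and do not come from any library.

```python
# Hypothetical toy data: points lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def residuals(m, c):
    """Differences between each observed y and the line y = m*x + c."""
    return [y - (m * x + c) for x, y in zip(xs, ys)]

def sum_diff(m, c):
    """Sum of the plain (signed) differences."""
    return sum(residuals(m, c))

def sum_abs_diff(m, c):
    """Sum of the absolute differences."""
    return sum(abs(r) for r in residuals(m, c))

def sum_sq_diff(m, c):
    """Sum of the squared differences (the usual least-squares objective)."""
    return sum(r * r for r in residuals(m, c))

# Compare a perfect fit with an obviously poor flat line.
candidates = {"y = 2x + 1": (2.0, 1.0), "y = 5": (0.0, 5.0)}
for label, (m, c) in candidates.items():
    print(label, sum_diff(m, c), sum_abs_diff(m, c), sum_sq_diff(m, c))
```

On this toy data the signed differences for the flat line y = 5 cancel to 0 even though it fits badly, while the absolute and squared sums are both 0 for the perfect fit and larger for the flat line.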