Why was this definition of variance chosen?

Apr 2008
20
0
Why do we use

variance = (the sum of the squared differences of each value from the mean) / (the number of values)

instead of

(the sum of the absolute values of the differences of each value from the mean) / (the number of values)?

I guess the reason has to do with why the least squares method is the best. So why is the least squares method the best?

I guess the answer to both questions has to do with the fact that the standard normal distribution has a standard deviation (and thus also a variance) equal to 1. So?

Please answer as lucidly as possible (consider that I have a low IQ), and explain every mathematical symbol you use. I'd prefer it if you explained it in words alone rather than in math symbols.
 

Prove It

MHF Helper
Aug 2008
12,897
5,001
First, least-squares methods depend on variances and covariances, and they are "the best" because least squares ensures that the covariance matrix of the estimator is the smallest possible.

Most people say only that we square each deviation to make all the deviations positive, so that no information is lost through cancellation. The other reason, which is why we don't use absolute values, is that when deviations are small (and they should be), squared deviations become even smaller.
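For concreteness, here is a minimal sketch in Python (the sample values are invented purely for illustration) that computes both quantities from the original question:

import math  # not strictly needed here, but handy if you extend the sketch

# A small, made-up sample (values chosen only for illustration).
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(values)
mean = sum(values) / n

# Variance: the average of the squared deviations from the mean.
variance = sum((x - mean) ** 2 for x in values) / n

# Mean absolute deviation: the average of the absolute deviations from the mean.
mad = sum(abs(x - mean) for x in values) / n

print(mean)      # 5.0
print(variance)  # 4.0
print(mad)       # 1.5

Notice that the single large deviation of 4 (the value 9) contributes 16 to the squared sum but only 4 to the absolute sum, so the two summaries weight large and small deviations very differently.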
 

matheagle

MHF Hall of Honor
Feb 2009
2,763
1,146
Fisher wanted the absolute values.
The nicer thing about the square instead is that it's differentiable.
Eventually Fisher lost out on the \(\displaystyle |\cdot|\).
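To see the differentiability point concretely (a standard textbook argument, not specific to this thread): if the data are \(\displaystyle x_1, \dots, x_n\) and we look for the centre \(\displaystyle c\) that minimises the sum of squared deviations \(\displaystyle S(c) = \sum (x_i - c)^2\), then \(\displaystyle S'(c) = -2\sum (x_i - c)\), and setting this to zero gives \(\displaystyle c = \bar{x}\), the mean. If instead we minimise the sum of absolute deviations \(\displaystyle A(c) = \sum |x_i - c|\), each term has a corner at \(\displaystyle c = x_i\), so \(\displaystyle A\) is not differentiable there; its minimiser turns out to be the median, and it cannot be found by simply setting a derivative to zero.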
 
Apr 2008
20
0
I'd bet that it has to do with the fact that the standard normal distribution has a variance and a standard deviation equal to 1.
Is least squares better than absolute values for locating the curve that is most probably correct, YES OR NO? And if it is better, why?

Am I the only one trying to connect it all? Someone in here must know.
 
Oct 2009
340
140
Matheagle and I have already stated why you square the differences instead of taking the absolute values: the function \(\displaystyle f(x) = x^2\) is nicer analytically than \(\displaystyle g(x) = |x|\). In particular, \(\displaystyle f\) is differentiable. If you don't see why this would be advantageous, then you would probably need a better background in mathematical statistics (or just mathematics in general) to understand a satisfactory answer to this question to begin with. The question is inherently mathematical. If you aren't sharp at math, there is no intuitive answer. It has nothing to do with the normal distribution in particular.

Least squares estimators are optimal in the following way: under the usual assumptions (normally distributed, independent, identically distributed errors), the estimators of the regression coefficients are UMVUEs (uniformly minimum variance unbiased estimators). If you drop the distributional assumption, they are BLUEs (best linear unbiased estimators). The criterion we use to evaluate the "goodness" of unbiased estimators is the variance, though, which is the issue you are asking about in the first place.
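As a rough simulation sketch of the minimum-variance idea (using the simplest possible case, estimating a single normal mean, rather than a full regression; the sample size and replication count are arbitrary choices): the least squares estimate of the centre is the sample mean, the least-absolute-deviations estimate is the sample median, and under normal sampling the mean has the smaller variance.

import numpy as np

# Compare the sampling variance of the mean (least squares estimate of
# the centre) and the median (least absolute deviations estimate)
# for normally distributed data.
rng = np.random.default_rng(0)
n, reps = 50, 20_000
samples = rng.normal(loc=0.0, scale=1.0, size=(reps, n))

means = samples.mean(axis=1)
medians = np.median(samples, axis=1)

print("variance of the mean:  ", means.var())    # close to 1/n = 0.02
print("variance of the median:", medians.var())  # larger, roughly (pi/2)/n

The ratio of the two variances approaches \(\displaystyle \pi/2 \approx 1.57\) for large samples, which is one concrete sense in which least squares is "best" under normality.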
 
Apr 2008
20
0
Is it impossible to construct the equation of a curve (the way least squares constructs it) using the average absolute deviation (or many absolute deviations)? Yes or no?

If it is possible, is the result a worse approximation of the real curve than the line or curve constructed by least squares? Yes or no?
 
Oct 2009
340
140
1) Yes, it is possible. In fact, I linked to the Wikipedia page on this topic in my first post in this thread.

2) It depends on your definition of "worse".....
 

Prove It

MHF Helper
Aug 2008
12,897
5,001
The Gauss-Markov theorem proves that the least squares estimator is the best, in the sense that the covariance matrix of a least squares estimator is the smallest possible covariance matrix among linear unbiased estimators. So yes, most likely, your approximation will be worse.
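For reference, the usual textbook statement (in standard notation, not quoted from any particular source) is: in the linear model \(\displaystyle y = X\beta + \varepsilon\) with \(\displaystyle E(\varepsilon) = 0\) and \(\displaystyle \operatorname{Var}(\varepsilon) = \sigma^2 I\), the least squares estimator \(\displaystyle \hat{\beta} = (X^T X)^{-1} X^T y\) is unbiased, and for any other linear unbiased estimator \(\displaystyle \tilde{\beta}\) the difference \(\displaystyle \operatorname{Var}(\tilde{\beta}) - \operatorname{Var}(\hat{\beta})\) is a positive semi-definite matrix. "Smallest covariance matrix" is meant in exactly that sense.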

I would suggest you read "MULTIPLE VARIABLE REGRESSION", an article I wrote for Issue 1 of the Math Help Forum e-zine.

http://www.mathhelpforum.com/math-help/pre-prints-other-original-work/138041-mhfzine-issue-1-a.html
 
Apr 2008
20
0
By "worse" I mean: The constructed curve which is the worse approximation, has the worse-greater average absolute deviation from the real curve.
 