Let me rephrase the question:
Is the best approximation curve (= the constructed curve that is closest to the real but unknown curve) the constructed curve B, from which the data have the least variance (least squares method), and not the constructed curve A, from which the data have the least average absolute deviation?
If THAT is proven, the proof must have something to do with the normal distribution. And the reason we use the given definition of variance is related to that proof and the normal distribution.
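To make the suspected link concrete, here is a sketch of the argument I have in mind. It assumes (my assumption, not something established above) that each data point is the real curve plus an independent normal error with the same variance, i.e. y_i = f(x_i) + e_i with e_i ~ N(0, sigma^2). The symbols f, e_i, sigma and n are only introduced for this illustration.

```latex
\begin{align*}
L(f) &= \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^{2}}}
        \exp\!\left(-\frac{\bigl(y_i - f(x_i)\bigr)^{2}}{2\sigma^{2}}\right) \\
\log L(f) &= -\frac{n}{2}\log\!\left(2\pi\sigma^{2}\right)
        - \frac{1}{2\sigma^{2}} \sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^{2}
\end{align*}
```

Under that assumption, the curve with the highest likelihood is exactly the curve with the smallest sum of squared deviations, since the first term does not depend on f. If the errors were not normal, this argument would not single out least squares.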
If THAT is false, then why do we use the least squares method and not the other one? Only because of the "possibly multiple solutions" property listed in the Wikipedia article "Least absolute deviations"? Is that the only reason least squares is chosen, or does the previous "THAT" also apply?
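As a small illustration of the "possibly multiple solutions" point, here is a minimal sketch that fits the simplest possible "curve", a single constant c, to four made-up data points. The data values, the candidate grid and the helper names `sum_sq` and `sum_abs` are my own assumptions for the example, not anything from the article.

```python
# Fit a constant c to the data under two criteria and compare the minimizers.

def sum_sq(c, ys):
    """Sum of squared deviations of the data from the constant c."""
    return sum((y - c) ** 2 for y in ys)

def sum_abs(c, ys):
    """Sum of absolute deviations of the data from the constant c."""
    return sum(abs(y - c) for y in ys)

ys = [1.0, 2.0, 4.0, 9.0]                    # made-up data, even number of points
candidates = [i / 100 for i in range(1001)]  # grid 0.00, 0.01, ..., 10.00

sq = [sum_sq(c, ys) for c in candidates]
ab = [sum_abs(c, ys) for c in candidates]

# Least squares: a single best constant on the grid (the mean of the data).
best_sq = candidates[sq.index(min(sq))]

# Least absolute deviations: every grid point whose cost ties with the minimum.
tol = 1e-9
lad_minimizers = [c for c, v in zip(candidates, ab) if v - min(ab) < tol]

print("least squares minimizer:", best_sq)
print("LAD minimizers range from", lad_minimizers[0], "to", lad_minimizers[-1])
print("number of tied LAD minimizers on the grid:", len(lad_minimizers))
```

On this data the squared criterion picks one value (the mean), while the absolute criterion is tied over the whole interval between the two middle data points, so the least absolute deviations answer is not unique here.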
And does the above have nothing to do with the reason we use the given definition of variance?