I am trying to fit a line to data that follows y = 3x, except for one bad data point.
x = 0, 2, 4, 6, 8
y = 0, 6, 12, 18, 8
The linear regression is y = 1.4x + 3.2, and it properly minimizes the least-squares error, but I would like to find a regression that better fits the real data and gives less emphasis to the bad data point (x = 8). In the real world, it is not known in advance which point is bad, so weighting is not an acceptable option.
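As a sanity check, the least-squares line quoted above can be reproduced from the usual closed-form formulas. This is only a minimal Python sketch; the function names `least_squares_line` and `sum_squared_error` are made up for illustration. It also shows why least squares prefers this line: the single bad point contributes a squared residual of 256 to y = 3x, which is more than the total squared error of the fitted line.

```python
xs = [0, 2, 4, 6, 8]
ys = [0, 6, 12, 18, 8]

def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope is the ratio of the x-y co-variation to the x variation.
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

def sum_squared_error(xs, ys, slope, intercept):
    """Sum of squared vertical residuals for the line y = slope*x + intercept."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

slope, intercept = least_squares_line(xs, ys)
print(slope, intercept)                             # approximately 1.4 and 3.2
print(sum_squared_error(xs, ys, slope, intercept))  # error of the least-squares line
print(sum_squared_error(xs, ys, 3.0, 0.0))          # error of y = 3x
```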
If I look at the area between the fitted line and the data curve, that area appears to be minimized by fitting through the correct data, which is the answer I am looking for.
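To make the area comparison concrete, here is a rough Python sketch. It assumes that "the data curve" means the data points joined by straight segments, which is my own choice of definition, and the function name `area_between` is made up for illustration. On each segment the gap between the data and the line is itself linear, so the absolute area can be computed exactly, including the case where the two cross. The sketch only compares two candidate lines; it does not show that either one is the true minimizer.

```python
xs = [0, 2, 4, 6, 8]
ys = [0, 6, 12, 18, 8]

def area_between(xs, ys, slope, intercept):
    """Area between the piecewise-linear data curve and y = slope*x + intercept."""
    total = 0.0
    for i in range(len(xs) - 1):
        h = xs[i + 1] - xs[i]
        # Signed gaps (data minus line) at both ends of the segment.
        d0 = ys[i] - (slope * xs[i] + intercept)
        d1 = ys[i + 1] - (slope * xs[i + 1] + intercept)
        if d0 * d1 >= 0:
            # No crossing inside the segment: area of a trapezoid.
            total += h * (abs(d0) + abs(d1)) / 2
        else:
            # The line crosses the data: two triangles meeting at the crossing.
            total += h * (d0 ** 2 + d1 ** 2) / (2 * (abs(d0) + abs(d1)))
    return total

print(area_between(xs, ys, 3.0, 0.0))  # line through the good data, y = 3x
print(area_between(xs, ys, 1.4, 3.2))  # least-squares line
```

With this definition, y = 3x gives an area of about 16 and the least-squares line gives about 22.4, which matches what I see when comparing the two plots.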
Can anyone refer me to regression equations (a derivation would be great) that minimize the area between a linear fit and the experimental data (with x as the independent variable)? I want to learn how to do this myself, so a MATLAB solution does not help.
Thanks for your time!