How do I use a numerical method to minimize a sum of squares when my expression has several variables?
The examples of, for instance, Newton-Raphson that I can find only use expressions with a single unknown. It's funny that textbooks stop there. The principle is explained, yes, but one is left short of any practically useful knowledge.
Btw, my problem is to fit the parameters of an ellipse to some points given.
Look at this link for Gradient Descent methods. If you follow the relevant
link you will see that, when used for the minimisation of functions of a vector
variable, Newton's method is one technique that can be used.
RonL
Okay, here comes the extended question:
I want to choose the parameters of an ellipse so that it fits some given points the best. An algebraic solution is far too complex (involving 1000s of terms), so a numerical solution is the way to go.
There are several ways to parameterize an ellipse; let's look at this one:
x = ox + a*Cos(t)*Cos(T) + b*Sin(t)*Sin(T)
y = oy - a*Cos(t)*Sin(T) + b*Sin(t)*Cos(T)
where:
{x;y} is a point on the perimeter of the ellipse.
{ox;oy} is the origin, or center, of the ellipse.
a is the semi-major axis.
b is the semi-minor axis (which one is major and which is minor doesn't matter, I think).
T is the tilt of the ellipse, the angle of the a-axis to the x-axis, say.
t is, well, not really a parameter to be fitted but an angle from 0 to 2pi that indexes the points on the ellipse; t=0 and t=pi give two opposite points. I can't explain it well, but it doesn't matter.
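Here is the parameterization above as a quick Java sketch (the class and method names are just mine for illustration):

```java
public class EllipsePoint {
    // Returns {x, y} on the ellipse at angle t, given center (ox, oy),
    // semi-axes a and b, and tilt T, using exactly the two formulas above.
    public static double[] point(double ox, double oy, double a, double b,
                                 double T, double t) {
        double x = ox + a * Math.cos(t) * Math.cos(T) + b * Math.sin(t) * Math.sin(T);
        double y = oy - a * Math.cos(t) * Math.sin(T) + b * Math.sin(t) * Math.cos(T);
        return new double[] { x, y };
    }
}
```

For example, with no tilt (T = 0), t = 0 gives the point (ox + a, oy), i.e. the end of the semi-major axis.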
Now I have the coordinates of a number of observed points {x1;y1},{x2;y2}...{xn;yn} which are supposed to lie close to the edge of an ellipse. So I want to choose the parameters ox, oy, a, b and T so that the sum of squared differences between the observed points and the closest calculated points is minimized.
First, I will choose a set of values for the parameters ox, oy, a, b and T (I will have good first guess).
Then I will choose one of the observed points {x1;y1}.
Then I will search for the value for t, 0<t<2pi, which minimizes:
SQRT( (x-x1)^2 + (y-y1)^2 ).
Then I will do the same with the next point {x2;y2}.
When this has been done for all n points, I will calculate the sum of the squared distances for all points for the set of parameter values used.
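As a sketch of that procedure in Java (the class and method names are mine, and the coarse grid plus ternary-search refinement is just one simple choice for the inner minimization over t):

```java
public class EllipseFit {
    // params p = {ox, oy, a, b, T}; same formulas as above.
    static double[] point(double[] p, double t) {
        double ox = p[0], oy = p[1], a = p[2], b = p[3], T = p[4];
        return new double[] {
            ox + a*Math.cos(t)*Math.cos(T) + b*Math.sin(t)*Math.sin(T),
            oy - a*Math.cos(t)*Math.sin(T) + b*Math.sin(t)*Math.cos(T)
        };
    }

    // Distance from observed point (px, py) to the ellipse point at angle t.
    static double dist(double[] p, double t, double px, double py) {
        double[] q = point(p, t);
        return Math.hypot(q[0] - px, q[1] - py);
    }

    // Inner minimization: coarse grid over [0, 2pi), then ternary-search
    // refinement in a bracket one grid step wide around the best grid point.
    static double closestT(double[] p, double px, double py) {
        double best = 0, bestD = Double.MAX_VALUE;
        for (int i = 0; i < 360; i++) {
            double t = 2 * Math.PI * i / 360;
            double d = dist(p, t, px, py);
            if (d < bestD) { bestD = d; best = t; }
        }
        double lo = best - Math.PI / 180, hi = best + Math.PI / 180;
        for (int k = 0; k < 50; k++) {
            double m1 = lo + (hi - lo) / 3, m2 = hi - (hi - lo) / 3;
            if (dist(p, m1, px, py) < dist(p, m2, px, py)) hi = m2; else lo = m1;
        }
        return (lo + hi) / 2;
    }

    // Outer objective: sum of squared closest distances over all points.
    static double sumSq(double[] p, double[][] pts) {
        double s = 0;
        for (double[] q : pts) {
            double d = dist(p, closestT(p, q[0], q[1]), q[0], q[1]);
            s += d * d;
        }
        return s;
    }
}
```

This is the function of the five parameters that the outer method then has to minimize; as a sanity check, for the unit circle (ox=oy=0, a=b=1, T=0) and the single point {2;0}, the closest ellipse point is {1;0} and the sum of squares is 1.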
Then it is time to change the parameters ox, oy, a, b and T and do it all over again. And it is here I need some help.
If I had only one parameter, I could easily use for example Newton-Raphson. But now that I need 5 parameters, what should I do?
I can formulate any necessary derivative, no problem.
I hope I have made an intelligible post here, and that the thing with the t-value and the extra minimization it requires hasn't obscured my real problem, which has to do with how to apply numerical solution methods to an expression with more than one unknown.
Don't do this yourself unless you are a competent programmer in a suitable
programming system (Matlab, Euler S/R), and even then you will be using the
library optimisation functions.
If you have access to a computer with MS Office that has been installed
properly, use the Excel Solver: it's worth its weight in gold.
RonL
Non-numerical method: Plot a scatter diagram of the points - they should
fall close to an ellipse if the data is reasonably good. Sketch in the ellipse
and take measurements of the relevant parameters.
In fact even if you intend to pursue a numerical solution you should start
with the scatter diagram just to check how adequately the data is represented
by an ellipse.
RonL
Okay, hehe, from the last two posts here I understand that you guys think this is a bit too difficult for me!
But I will learn! Because I will have more use for solutions to this kind of problem: numerical methods for fitting geometric shapes to some given points. Although I'm rusty at both math and programming, I am actually getting into computer vision! It's one of the next multi-billion-dollar mega hypes, I bet, because with today's camera technology and computation capacity everything is possible. Yet almost no applications exist. I don't know anyone who has a computer vision gadget of any kind, and I don't think I've ever seen one in a shop. And since I've missed all the other IT hype trains, I'm jumping on this one. Anyway, it can be a quite rewarding mistake to make...
Thank you for this link, much better than what I've found. I have taken a look at it and, as you say, Newton's method seems to be suitable.
As I understand it, the iterations progress simply by multiplying the inverse of the Hessian by the gradient and subtracting that product from the result of the last iteration (or from the initial guess). Quoting Wikipedia:
The gradient is "a column vector whose components are the partial derivatives of the function".
And the Hessian is "the square matrix of second partial derivatives of a scalar-valued function".
I know how to do partial derivatives and some matrix stuff, so this should be a piece of cake! I'll start with that and come back when I get stuck...
But what does it mean, in the quote above, to "replace the reciprocal of the second derivative"? In Newton's method in one dimension there is the second derivative itself, not its reciprocal. So what is meant?
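To make the update x_{n+1} = x_n - H^(-1) * grad f concrete for myself, here it is on a toy two-variable function (the function, class and method names are just my own illustration, with the 2x2 linear solve written out by hand via Cramer's rule rather than forming the inverse explicitly):

```java
public class NewtonDemo {
    // Toy function (mine): f(x, y) = (x-1)^2 + 2*(y+3)^2, minimum at (1, -3).
    static double f(double x, double y) { return (x-1)*(x-1) + 2*(y+3)*(y+3); }

    // Gradient: column vector of partial derivatives.
    static double[] grad(double x, double y) {
        return new double[] { 2*(x - 1), 4*(y + 3) };
    }

    // Hessian: square matrix of second partial derivatives (constant here).
    static double[][] hess() { return new double[][] { {2, 0}, {0, 4} }; }

    // One Newton step: solve H * s = grad f, then return x_n - s.
    static double[] newtonStep(double x, double y) {
        double[] g = grad(x, y);
        double[][] h = hess();
        double det = h[0][0]*h[1][1] - h[0][1]*h[1][0];
        double sx = ( h[1][1]*g[0] - h[0][1]*g[1]) / det;
        double sy = (-h[1][0]*g[0] + h[0][0]*g[1]) / det;
        return new double[] { x - sx, y - sy };
    }
}
```

Since the function is quadratic, a single step from (0, 0) lands exactly on the minimum (1, -3). Note the step solves a linear system instead of inverting H, which is the usual way to avoid exactly the kind of symbolic inverse that blows up.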
Okay, so I'm stuck again!
Inverting the Hessian took only 10 minutes for Mathematica on my PC. But the result takes something like 10 pages to display (I don't know exactly, because Mathematica crashed when I tried to copy that huge cell). And I haven't yet multiplied it by the gradient vector.
So Newton's method with symbolic derivatives is practically impossible here. Actually it seems to be computationally worse than the enormous algebraic solution to the problem. How come? Is it mathematically impossible to fit an ellipse to some points? Both the analytic and the numerical approach fail miserably.
Now, should I try the Gauss-Newton algorithm or the Levenberg-Marquardt algorithm, or will they give similarly huge output? Maybe the computationally most efficient way to solve the problem is to use a kind of Monte Carlo approach: randomly select values for the parameters (within reasonable limits) and pick the set with the least error? Don't any intelligent mathematical methods exist for problems with more than about 4 variables?
Choose any direction represented by the unit vector u; then if we put
x = a u, where a is a scalar, we can minimise f(x) = f(a u), which is a
1-D minimisation problem in the variable a.
The trick is to choose u at each stage in an appropriate manner, and we have
a method which uses no matrix inversion.
Say x_n is our current estimate of the minimum. Put u = grad f(x_n) (we don't
actually have to worry about normalising this to a unit vector, but you can if you
want), and then our next estimate x_{n+1} is the x giving the minimum of this 1-D
problem (obtained by bog-standard Newton-Raphson if you like).
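A sketch of this scheme in Java (the example function, the numerical gradient, the ternary-search line minimisation over a fixed bracket, and all names are just illustrative choices, not the only way to do it):

```java
public class SteepestDescent {
    // Example objective (illustrative): f(v) = (v0-2)^2 + (v1+1)^2, min at (2, -1).
    static double f(double[] v) { return (v[0]-2)*(v[0]-2) + (v[1]+1)*(v[1]+1); }

    // Numerical gradient by central differences; no symbolic algebra needed.
    static double[] grad(double[] v) {
        double h = 1e-6;
        double[] g = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            double[] p = v.clone(), m = v.clone();
            p[i] += h; m[i] -= h;
            g[i] = (f(p) - f(m)) / (2 * h);
        }
        return g;
    }

    static double[] move(double[] x, double[] u, double a) {
        double[] r = x.clone();
        for (int i = 0; i < r.length; i++) r[i] += a * u[i];
        return r;
    }

    // 1-D minimisation of f(x + a*u) over the scalar a, by ternary search
    // on a fixed bracket [0, 10] (a simplification; a proper line search
    // would bracket the minimum first).
    static double lineMin(double[] x, double[] u) {
        double lo = 0, hi = 10;
        for (int k = 0; k < 100; k++) {
            double a1 = lo + (hi - lo) / 3, a2 = hi - (hi - lo) / 3;
            if (f(move(x, u, a1)) < f(move(x, u, a2))) hi = a2; else lo = a1;
        }
        return (lo + hi) / 2;
    }

    // Outer loop: repeatedly take the downhill direction u = -grad f(x_n)
    // and line-minimise along it. No matrix inversion anywhere.
    static double[] minimize(double[] x0, int iters) {
        double[] x = x0;
        for (int n = 0; n < iters; n++) {
            double[] g = grad(x);
            double[] u = new double[g.length];
            for (int i = 0; i < g.length; i++) u[i] = -g[i];
            x = move(x, u, lineMin(x, u));
        }
        return x;
    }
}
```

On this example the contours are circles, so the very first line minimisation already lands at (2, -1); in general the method zig-zags towards the minimum over several iterations.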
RonL
Yes, since the symbolic computation is so impractical. Then I need code for computing the Hessian numerically, but I'm sure that is easy to find ready-made. Btw, I've found a Java implementation of the Levenberg-Marquardt algorithm, which seems to be related to Newton's method and Gradient Descent.
I'm writing a demo prototype in Java for simplicity. Later a professional programmer friend will write some realtime C++ code, if I can convince him that the idea is practical. It's at that stage that computation time becomes important.
So how does one optimize the computation? Are there any tricks, or is it just a question of identifying repeated subexpressions? (I suppose the compiler doesn't do that spontaneously; no, it isn't Christmas yet...)
The trigonometric operations needed are quite few, almost only Cos or Sin of a single parameter. Those values can be computed once for each set of parameters. Also, I note that all the first derivatives have the same denominator, and that expression occurs many times in the second derivatives too; it needs to be computed only once. These two simple measures should reduce the computation time substantially.
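A tiny sketch of the caching idea (names mine): the tilt T is fixed for a whole parameter set, so its sine and cosine can be computed once in a constructor and reused for every point; only cos(t) and sin(t) vary per point.

```java
public class TrigCache {
    final double cosT, sinT;   // trig of the tilt, computed once per parameter set

    TrigCache(double T) {
        cosT = Math.cos(T);
        sinT = Math.sin(T);
    }

    // Ellipse point using the cached tilt values.
    double[] point(double ox, double oy, double a, double b, double t) {
        double ct = Math.cos(t), st = Math.sin(t);
        return new double[] {
            ox + a*ct*cosT + b*st*sinT,
            oy - a*ct*sinT + b*st*cosT
        };
    }
}
```

The same pattern applies to the shared denominator of the derivatives: compute it once per evaluation and pass it in, rather than re-deriving it inside every partial derivative.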
Thanks for the input, everybody!