Re: Regression and Maximums
Quote (originally posted by Eric1967):
My problem is that: a) the maximums drift from one X to the one next to it, and b) I do not obtain as many maximums as I was expecting.
I am presently using a function like this one: W = A + B(z1) + C(z1^2) + D(z1^3) + E(z2) + F(z2^2) + G(z2^3) + ...
Should I use another function type?
Hi!
This probably means that there are many solutions, each giving almost the same least-squares value (i.e. many different equations are likely to fit the data with the same accuracy).
This often occurs when the number of parameters to be optimized is too large compared to the number of experimental data points.
For example, with W = A + B(z1) + C(z1^2) + D(z1^3) + E(z2) + F(z2^2) + G(z2^3), you have 7 parameters to optimize, which is a rather large number. If the expected shape is not simple (i.e. it has several maximums and minimums) and/or if the data is scattered, an unambiguous fit will require an enormous amount of experimental data.
If the available data set is not large enough, a smaller number of parameters is necessary. This will not be possible with functions as elementary as linear, square, cubic, etc. The difficult problem is then to conjecture a set of only a few functions, each more complicated than the preceding simple ones. This requires careful study of the data, observation, and much trial and error while testing candidate functions.
Re: Regression and Maximums
JJacquelin,
I have a lot of data, and the number of points is not an issue here. What I'm looking for is a function (whatever it may be) that presents maximums at some specific coordinates. My original data Y, obtained experimentally, varies as a function of two coordinates (x and y). I know where the maximums need to be (at which x) for each y. I don't really care about the values of the resulting function, only the maximums. There are 7 maximums to be found over the x interval for each specific y. That's what I want to maximize. In addition to the Ys obtained, there are about 12 possible parameter values that my regression can use for each X and Y. These are computed, and their level of correlation is (obviously) unknown at the beginning. That being said, we can remove or add parameters as we get closer to the solution. Since I automated the regression, adding or removing parameters is not a big issue. To be clear, what I am looking for is the math for a regression that will maximize the maximums at the identified xs.
Here are the details:
I have over 12 computed series of values: E1(x,y), E2(x,y), E3(x,y), …, E12(x,y)
Coordinate X varies from i to j (50 points each)
Coordinate Y varies from m to n (1000 series)
In addition to that, I know where the 7 maximums for each Y are (xi1, xi2, …, xi7). These vary for each Y. For the first pass I use the following:
Max(x,y) ∈ {0, 1}: 1 where there should be a maximum and 0 where there should not be.
Presently, I’m doing a regression by using a function
R(x,y) = A + B*E1 + C*E1^2 + D*E1^3 + E*E2 + F*E2^2 + G*E2^3 + … + AI*E12 + AJ*E12^2 + AK*E12^3 (in the worst case).
But for now, let’s just say that we will use:
R(x,y) = A + B*E1 + C*E1^2 + D*E1^3
So the sum of squares will be
S = ∑ ∑ (Max – R)^2 = ∑ ∑ (Max^2 – 2*R*Max + R^2)
dS/dA = 0
dS/dB = 0
dS/dC = 0
dS/dD = 0
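Setting these four partial derivatives to zero gives a linear system in A, B, C, D (the normal equations). A minimal sketch in plain Python, assuming the E1 values and the 0/1 Max targets are flattened into two lists over all (x, y) points. The names `solve_linear`, `fit_cubic`, `e1`, and `target` are illustrative only and not part of my actual automation:

```python
def solve_linear(m, v):
    """Solve m * x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(m)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_cubic(e1, target):
    """Least-squares A, B, C, D for R = A + B*E1 + C*E1^2 + D*E1^3."""
    basis = [[1.0, e, e ** 2, e ** 3] for e in e1]
    # dS/dA = dS/dB = dS/dC = dS/dD = 0  <=>  (X^T X) p = X^T t
    xtx = [[sum(row[i] * row[j] for row in basis) for j in range(4)]
           for i in range(4)]
    xtt = [sum(row[i] * t for row, t in zip(basis, target)) for i in range(4)]
    return solve_linear(xtx, xtt)
```

With more E series the same pattern applies, only with more basis columns; the system can become badly conditioned as columns are added.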
But as I said, the maximums drift from one x to the next, and I don't get as many maximums as I would like.
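Both symptoms can be measured for a single y by listing the local maxima of the fitted R over the 50 x samples and comparing them with the expected positions. A rough sketch, assuming R is available as a list of sampled values and the expected maxima as sample indices. The function names and the tolerance `tol` are hypothetical choices for illustration:

```python
def local_maxima(values):
    """Return indices i where values[i] is strictly greater than both neighbours."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] > values[i + 1]]


def compare_maxima(fitted, expected_idx, tol=1):
    """Return the fitted maxima and how many expected maxima lie within tol samples of one."""
    found = local_maxima(fitted)
    hits = [e for e in expected_idx if any(abs(e - f) <= tol for f in found)]
    return found, len(hits)
```

A count below 7 corresponds to the missing maximums, and a match that only succeeds with `tol=1` rather than `tol=0` corresponds to a drift by one x.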
Now, knowing that dR/dx = 0 and d2R/dx2 < 0 at xi1, xi2, …, xi7 for the maximums, and that every other point must have either dR/dx = 0 and d2R/dx2 > 0, or dR/dx ≠ 0, how could I reformulate my model to constrain my regression to provide as many maximums as possible?
Re: Regression and Maximums
Well, it seems a tricky problem.
In fact, I cannot fully understand it, especially the symbols and notation.
Quote: << I have over 12 computed series of values; E1 (x,y), E2 (x,y), E3(x,y), …, E12 (x,y) >>
What is a “computed series”? How is it computed?
Quote: << Coordinates X varies from i to j (50 points each)
Coordinate Y varies from m to n (1000 series) >>
So, for one known value of Y, you have 50 values of X and, correspondingly, 50 values of R. Right?
Quote: << There is 7 maximums over the x interval to be found for each specific y. >>
So, for the same known value of Y, there are 7 maximums of the function of x alone. Right?
Quote: << Presently, I’m doing a regression by using a function
R(x,y) = A + B*E1 + C*E1^2 + D*E1^3 + E*E2 + F*E2^2 + G*E2^3 + …+ AI*E12 + AJ*E12^2 + AK*E12^3 >>
Suppose that you consider only one known value of Y and the corresponding 50 pairs (X, R). Is your computation process then able to find the 7 values of x corresponding to the seven maximums (related to this particular Y only, of course)?
I doubt that it would work with so small a number (50) of pairs (X, R). In fact, a function with 7 maximums and therefore 6, 7, or 8 minimums (i.e. 13, 14, or 15 extrema) may require about 1000 pairs (X, R), and probably many more, to achieve a reliable least-squares regression.
Of course, if the kind of function were known a priori, with only a few unknown parameters in it, the required number of points would be much lower. For example, if the function were known to be A+B*sin(C*x+D), only a few tens of points would be required. As far as I can understand, this is not the case in your problem: the kind of function is not known a priori.
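To illustrate the sine example only (not your actual problem), here is a small sketch in plain Python. It assumes a candidate grid of values for C is supplied by the user; for each fixed C the model is linear in the remaining unknowns, so they follow from a 3x3 least-squares system. The names `det3`, `fit_sine`, and `c_grid` are invented for this illustration:

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_sine(xs, ys, c_grid):
    """Fit y = A + B*sin(C*x + D) by scanning C and solving the rest linearly."""
    best = None
    for c in c_grid:
        # For fixed C: A + B*sin(C*x + D) = A + p*sin(C*x) + q*cos(C*x),
        # with p = B*cos(D) and q = B*sin(D), which is linear in (A, p, q).
        basis = [[1.0, math.sin(c * x), math.cos(c * x)] for x in xs]
        xtx = [[sum(r[i] * r[j] for r in basis) for j in range(3)]
               for i in range(3)]
        xty = [sum(r[i] * y for r, y in zip(basis, ys)) for i in range(3)]
        d = det3(xtx)
        if abs(d) < 1e-12:
            continue  # degenerate basis for this C (e.g. C = 0)
        # Cramer's rule on the 3x3 normal equations.
        sol = []
        for k in range(3):
            mk = [row[:] for row in xtx]
            for i in range(3):
                mk[i][k] = xty[i]
            sol.append(det3(mk) / d)
        a, p, q = sol
        sse = sum((a + p * r[1] + q * r[2] - y) ** 2
                  for r, y in zip(basis, ys))
        if best is None or sse < best[0]:
            best = (sse, a, math.hypot(p, q), c, math.atan2(q, p))
    if best is None:
        raise ValueError("no usable value of C in c_grid")
    _, a, b, c, d = best
    return a, b, c, d
```

With 30 points and only 4 unknowns the fit is well determined, which is the contrast with a function of unknown form having 13 to 15 extrema.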