1. ## regressions

Suppose that you wish to use a linear regression to predict the dependent variable Y using the dependent variable X1. You collect a scatter plot of points (X1,i, Yi) and notice that the plot seems to be well represented by a piecewise continuous linear function that satisfies the following conditions: For X1,i<=xo, the appropriate linear model appears to be Yi = a + B1 X1,i + Ei. However, when X1,1>xo the slope of the applicable linear relationship appears to change to B1+B2.

Hint: First, determine the applicable form of the linear model that represents the plot of points (X1,i, Yi) when X1,i>xo, and note that both the slope and Y-intercept of the model will differ from the model applicable to the set of points (X1,i, Yi) when X1,i<=xo. Now see if you can devise a way to combine these two linear relationships into a single multiple regression model by using the indicator variable X2,i, where X2,i = +1 if X1,i>xo and X2,i = 0 otherwise.

a. There is no way to combine these two linear relationships into a single multiple regression. You should instead run each regression separately as simple regressions.
b. The combined model is: Yi = a + B1 X1,i + B2 X1,i X2,i + Ei
c. The combined model is: Yi = a + B1 X1,i + B2 (xo - X1,i) X2,i + Ei
d. The combined model is: Yi = a + B1 X1,i + B2 (a - xo - X1,i) X2,i + Ei
e. The combined model is: Yi = a + B1 X1,i + B2 (X1,i - xo) X2,i + Ei

2. I started to read this yesterday, but I immediately saw that Y and X are both called dependent.
And don't you mean X1,i>xo, not X1,1>xo?
You need to type this more carefully.

This is just curve fitting, not stats.
You want to incorporate both models using an indicator function $x_2$.

You want $y=a+b_1x_1$ when $x_1\le x_0$

and you want $y=a+(b_1+b_2)x_1$ when $x_1>x_0$; don't worry about the $\epsilon$.
But, I'm concerned about using the same y-intercept here, a.

BUT it is the same a.
When $x_1=0$, we have the same intercept.
So I would go with (b).

3. Thank you, and sorry about the typos. You have been a big help.

4. Hey guys, I am actually trying to find an answer to this exact same question as well. I messaged matheagle earlier, sorry for the confusion. It appears as if btnh and I are doing a similar assignment... Any help on this problem is appreciated.

5. B2 can be anything, which means that the two slopes can be completely different, say m=B1+B2, but the a is a problem.

We have y=a+B1x for x<x0 and y=a+mx for x>x0, and the a's in these two equations should not be the same.
However, if x0>0 then the intercept is in the first region.

I feel that (b), the combined model Yi = a + B1 X1,i + B2 X1,i X2,i + Ei,
will estimate both B1 and B2 properly, but I doubt it will estimate a correctly in both cases.

There is one real way to do this.
I can run a regression on b-e, but that means I need to do this for each equation via matrices $\hat\beta=(X^tX)^{-1}X^tY$.
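
For anyone who wants to try the matrix approach, here is a minimal sketch for options (b) and (e) only. It assumes NumPy and uses synthetic data; the "true" values of a, B1, B2 and x0 are made up for illustration, and the data are generated as a continuous piecewise line because that is how the problem describes the scatter plot.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" values, chosen only for illustration.
a, b1, b2, x0 = 2.0, 1.0, 3.0, 5.0

x1 = np.linspace(0.0, 10.0, 200)
x2 = (x1 > x0).astype(float)  # indicator X2: 1 if x1 > x0, else 0

# Continuous piecewise-linear mean: slope b1 up to x0, slope b1 + b2 after it.
y = a + b1 * x1 + b2 * np.maximum(x1 - x0, 0.0) + rng.normal(0.0, 0.5, x1.size)

def fit(third_column):
    """Solve the normal equations (X^t X) beta = X^t y for the design
    [1, x1, third_column]; return (beta_hat, residual sum of squares)."""
    X = np.column_stack([np.ones_like(x1), x1, third_column])
    # Same estimate as (X^t X)^{-1} X^t y, without forming the inverse.
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    rss = float(np.sum((y - X @ beta_hat) ** 2))
    return beta_hat, rss

beta_b, rss_b = fit(x1 * x2)         # option (b): B2 * X1 * X2
beta_e, rss_e = fit((x1 - x0) * x2)  # option (e): B2 * (X1 - x0) * X2

print("option (b): beta_hat =", beta_b, " RSS =", rss_b)
print("option (e): beta_hat =", beta_e, " RSS =", rss_e)
```

Comparing the two residual sums of squares on data like this is one way to see which design matches a continuous kink; it says nothing about data that really do jump at x0.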

6. Yes that is true. However, does the model's ability to estimate alpha properly affect the solutions? I don't know if that matters and would cause a difference? I guess I am torn between 'A' and 'B'. However, 'A' simply says that the relationships can't be combined, but I'm not sure that this is true. 'A' or 'B'...that is the question...

7. Also, it seems that 'E' could be an answer as well. If you construct a basic math model and plug in values, 'E' gives numbers that could very well be correct. What do you think matheagle?
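
The plug-in check can be sketched in a few lines of plain Python. The numbers for a, B1, B2 and x0 below are hypothetical, and the error term is left out so that only the fitted lines are compared.

```python
# Hypothetical values chosen only for illustration; none come from the problem.
a, b1, b2, x0 = 2.0, 1.0, 3.0, 5.0

def indicator(x1):
    """X2 from the problem: 1 if x1 > x0, else 0."""
    return 1.0 if x1 > x0 else 0.0

def model_b(x1):
    """Option (b) without the error term."""
    return a + b1 * x1 + b2 * x1 * indicator(x1)

def model_e(x1):
    """Option (e) without the error term."""
    return a + b1 * x1 + b2 * (x1 - x0) * indicator(x1)

eps = 1e-9  # tiny step just past the breakpoint
jump_b = model_b(x0 + eps) - model_b(x0)
jump_e = model_e(x0 + eps) - model_e(x0)

# Slope to the right of x0, measured between two points beyond the breakpoint.
slope_b = (model_b(8.0) - model_b(6.0)) / 2.0
slope_e = (model_e(8.0) - model_e(6.0)) / 2.0

print("jump at x0:  (b) =", jump_b, "  (e) =", jump_e)
print("right slope: (b) =", slope_b, "  (e) =", slope_e)
```

With these numbers both options give the slope B1 + B2 to the right of x0, but (b) steps up by about B2 times x0 at the breakpoint while (e) does not.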

8. I'm still leaning towards
b. The combined model is: Y= a + B1 X1,i + B2 X1,i X2,i + Ei
When X1 is less than x0, then X2 is zero and the model reduces to
Y= a + B1 X1,i + Ei
and when X2 is 1, we have
Y= a + B1 X1,i + B2 X1,i + Ei =a + (B1 + B2) X1,i + Ei
I'm not sure, but my gut says that (b) is right, since
a is the same intercept in either case:
if X1=0 we get the SAME intercept, a.
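
One more check may be worth doing, since the problem calls the fit piecewise continuous: the two lines should then agree at $X_1=x_0$. Writing the intercept of the right-hand line as $a'$ (a symbol introduced here for illustration, not part of the problem) and setting the two lines equal at $x_0$:

$$a + B_1 x_0 = a' + (B_1+B_2)\,x_0 \quad\Longrightarrow\quad a' = a - B_2 x_0$$

$$Y = (a - B_2 x_0) + (B_1+B_2)\,X_1 = a + B_1 X_1 + B_2\,(X_1 - x_0), \qquad X_1 > x_0$$

Under that reading the two intercepts coincide only when $B_2 x_0 = 0$, which would agree with the hint that both the slope and the Y-intercept differ. The extra term $B_2(X_1 - x_0)$, switched on by $X_2$, has the form of option (e).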