# Thread: Regression models that ignore regressors

Hi,

Suppose X and Y are independent discrete random variables with joint distribution P(X, Y). My regression model is of the form

$Z = E[Z | X, Y] + \epsilon = f(X, Y) + \epsilon,$

where the random variable $\epsilon$ is the noise term. Next, assume that we ignore our knowledge about $Y$. Then we have another regression model

$Z' = E[Z' | X] + \eta = g(X) + \eta,$

where $\eta$ is the noise term. We can express $g(X)$ as

$g(X) = \sum_y P(Y = y) f(X, y).$
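This marginalisation can be checked numerically. The sketch below uses a small hypothetical example (the distributions `px`, `py` and the table `f` are made up for illustration): it simulates independent discrete X and Y, forms Z = f(X, Y) + ε, and compares the empirical E[Z | X = x] against $\sum_y P(y) f(x, y)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discrete example: X in {0, 1}, Y in {0, 1, 2}, independent.
px = np.array([0.4, 0.6])            # P(X = x)
py = np.array([0.2, 0.5, 0.3])       # P(Y = y)
f = np.array([[1.0, 2.0, 4.0],       # f(x, y), rows indexed by x, cols by y
              [0.5, 3.0, 1.0]])

n = 200_000
x = rng.choice(2, size=n, p=px)
y = rng.choice(3, size=n, p=py)      # drawn independently of x
eps = rng.normal(0.0, 0.1, size=n)   # noise with E[eps | X] = 0
z = f[x, y] + eps

# Regressing Z on X alone: the empirical conditional mean E[Z | X = x] ...
g_empirical = np.array([z[x == k].mean() for k in range(2)])
# ... agrees with the marginalisation over P(Y): sum_y P(y) f(x, y)
g_formula = f @ py

print(g_empirical)
print(g_formula)
```

With 200,000 draws the two vectors agree to about two decimal places.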

My first question is: How can we interpret the right-hand side of the last equation? Since X and Y are independent, we have for each x

$g(x) = \sum_y P(y) f(x, y) = \sum_y P_{Y | X}(y | x) f(x, y) = E[f(X, Y) | X = x].$

So the function g at x, which has no knowledge of Y, can be regarded as the conditional expectation of f given X = x. Is this argument correct? If independence does not hold, what do you call the expression

$\sum_y P(y) f(x, y),$

which looks a bit like an expectation?
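To make the distinction concrete, here is a small sketch with a hypothetical *dependent* joint distribution (the table `pxy` and the function values `f` are invented for illustration). It computes $\sum_y P(y) f(x, y)$, which averages f against the marginal of Y, and E[f(X, Y) | X = x], which averages against the conditional of Y given X = x; without independence the two differ.

```python
import numpy as np

# Hypothetical joint distribution P(X, Y) with dependence between X and Y.
pxy = np.array([[0.30, 0.10],        # P(X = x, Y = y), rows x, cols y
                [0.10, 0.50]])
f = np.array([[1.0, 5.0],
              [2.0, 3.0]])

py = pxy.sum(axis=0)                 # marginal P(Y = y)
px = pxy.sum(axis=1)                 # marginal P(X = x)
py_given_x = pxy / px[:, None]       # conditional P(Y = y | X = x)

# sum_y P(y) f(x, y): f averaged against the *marginal* of Y
marginal_avg = f @ py
# E[f(X, Y) | X = x]: f averaged against the *conditional* of Y given X = x
cond_exp = (py_given_x * f).sum(axis=1)

print(marginal_avg)
print(cond_exp)
```

Here the two vectors disagree at every x, so the marginal-weighted sum is genuinely a different object from the conditional expectation once independence fails.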

The last question concerns the error term $\eta$. From the above, we have

$Z' = \sum_y P(y) Z - \sum_y P(y)\epsilon + \eta.$

If I assume that $Z' = \sum_y P(y) Z$, then I may conclude

$\eta = \sum_y P(y)\epsilon.$

But under which conditions is my assumption about Z' valid?
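One way to probe this numerically: since the weights sum to one, $\sum_y P(y) Z = Z$, so the assumption amounts to $Z' = Z$, i.e. both models describe the same response. In that case $\eta = Z - g(X) = f(X, Y) - g(X) + \epsilon$, which still has zero conditional mean given X when X and Y are independent. A minimal sketch under that assumption (all distributions and function values below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: X, Y independent and discrete, Z = f(X, Y) + eps.
px = np.array([0.4, 0.6])
py = np.array([0.2, 0.5, 0.3])
f = np.array([[1.0, 2.0, 4.0],
              [0.5, 3.0, 1.0]])

n = 200_000
x = rng.choice(2, size=n, p=px)
y = rng.choice(3, size=n, p=py)
eps = rng.normal(0.0, 0.1, size=n)
z = f[x, y] + eps

g = f @ py                            # g(x) = sum_y P(y) f(x, y)
eta = z - g[x]                        # residual of the X-only model when Z' = Z

# eta = f(X, Y) - g(X) + eps absorbs the Y-variation of f plus the old noise,
# yet its conditional mean given X is (approximately) zero.
cond_means = [eta[x == k].mean() for k in range(2)]
print(cond_means)
```

So under independence the coarser model is still a valid regression of Z on X; its noise term is just larger, not biased.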

Thanks and best wishes,

samosa