I am estimating a time-series regression

$\displaystyle \log(q_t) = \beta_0 + \beta_1 \log(y_t) + \beta_2 \log(p_t) + \beta_3 d82 + \beta_4 \log(q_{t-1}) + \varepsilon_t$

where q is per capita cigarette consumption, y is per capita income, p is the price of cigarettes, and d82 is a dummy variable for every year after 1982, when new smoking laws were introduced. Without the lagged value of q in the regression, there is significant serial correlation in the errors.

I have tested log(q), log(p) and log(y) for unit roots and did not find enough evidence to reject the null of a unit root in any of the series.

I've been asked to check for cointegration between the three series log(q), log(y) and log(p) using the Engle-Granger procedure, and I've been told to include d82 in the cointegrating regression. (I'm aware this is inferior to the Johansen method, but we are keeping it simple.)
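
To make the setup concrete, this is how I understand the two steps of the procedure when the lag is left out. The symbols $\alpha_i$, $u_t$, $\rho$, $\gamma_j$, $e_t$ and the lag length $k$ are my own notation, not something I was given. The first step is the static regression in levels:

$\displaystyle \log(q_t) = \alpha_0 + \alpha_1 \log(y_t) + \alpha_2 \log(p_t) + \alpha_3 d82 + u_t$

The second step is a Dickey-Fuller type regression on the estimated residuals $\hat{u}_t$:

$\displaystyle \Delta \hat{u}_t = \rho \hat{u}_{t-1} + \sum_{j=1}^{k} \gamma_j \Delta \hat{u}_{t-j} + e_t$

where the null $\rho = 0$ corresponds to a unit root in the residuals, i.e. no cointegration. As far as I know, the test statistic should be compared with Engle-Granger (MacKinnon) critical values rather than the standard Dickey-Fuller ones, because $\hat{u}_t$ is estimated.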

Here's the question. I am not sure whether I should include the lag of q in my cointegrating regression, given that it was needed in the original model to deal with the serial correlation. Are there any problems with doing so? If I include the lag of q, my cointegrating regression becomes just my original model. With the residuals from this model I reject the null of a unit root, which would suggest that the three series are cointegrated.

However, when I don't include the lag of q, the results are very different: I cannot reject the null of a unit root in the residuals.
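
In case it helps, here is a minimal sketch of the comparison I am making, written in Python with NumPy only. The data are simulated placeholders, not my actual series, and the helper names `ols` and `df_tstat`, the sample length and the position of the 1982 break are all made up for illustration. The residual test is a plain Dickey-Fuller regression with no augmentation lags; in practice one would probably use something like `adfuller` or `coint` from `statsmodels.tsa.stattools` instead.

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients and residuals for y = X b + e."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def df_tstat(resid):
    """Dickey-Fuller t-statistic: regress d(e_t) on e_{t-1}, no constant, no lags."""
    e_lag = resid[:-1]
    de = np.diff(resid)
    rho = (e_lag @ de) / (e_lag @ e_lag)
    u = de - rho * e_lag
    s2 = (u @ u) / (len(de) - 1)          # residual variance, one estimated parameter
    se = np.sqrt(s2 / (e_lag @ e_lag))
    return rho / se

# Simulated placeholder data (NOT the real cigarette series).
rng = np.random.default_rng(0)
T = 60                                     # hypothetical number of annual observations
log_y = np.cumsum(rng.normal(0.02, 0.03, T))   # random walk with drift
log_p = np.cumsum(rng.normal(0.01, 0.04, T))   # random walk with drift
d82 = (np.arange(T) >= 40).astype(float)       # hypothetical position of the break
log_q = 1.0 + 0.5 * log_y - 0.4 * log_p - 0.1 * d82 + rng.normal(0, 0.05, T)

const = np.ones(T)

# Version 1: static cointegrating regression (no lag of q).
X_static = np.column_stack([const, log_y, log_p, d82])
_, resid_static = ols(log_q, X_static)

# Version 2: the original dynamic model, with log(q_{t-1}) on the right-hand side.
X_dyn = np.column_stack([const[1:], log_y[1:], log_p[1:], d82[1:], log_q[:-1]])
_, resid_dyn = ols(log_q[1:], X_dyn)

t_static = df_tstat(resid_static)
t_dyn = df_tstat(resid_dyn)

# Compare against Engle-Granger (MacKinnon) critical values, not the standard
# Dickey-Fuller table, because the residuals come from an estimated regression.
print(f"DF t-stat, static regression residuals:  {t_static:.2f}")
print(f"DF t-stat, dynamic regression residuals: {t_dyn:.2f}")
```

The only difference between the two versions is the extra `log_q[:-1]` column, which is exactly the term I am unsure about including.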

Can anyone give me any guidance on whether or not I can include the lagged variable of q in the cointegrating regression and whether or not that is optimal? I can't find much on this topic.

Thanks