Log-Lin, Lin-Log, and Log-Log Regression

Hi! I'm having trouble understanding why you interpret the "log" part in the log-lin, lin-log, and log-log models as a percentage change. Let's say you have the regression equation Y = Beta_{0} + Beta_{1}log(x). I read that if you take the derivative of Y with respect to x, you end up with deltaY = Beta_{1} * (deltax/x), so the change in Y is equal to the coefficient estimate times the percentage change in x. What does the derivative of Y have to do with what value I plug in for x to get Y? I know that in a lin-log model, as x increases, Y grows at a decreasing rate. But I don't see how that is relevant to the regression equation.

So basically I'm asking: if you have the regression equation Y = 30 + 50log(x), why do you interpret the second term on the right-hand side to mean that a 1% change in x is on average associated with a 0.5 unit increase in Y? I know that the (deltax/x) part of the derivative mentioned above has everything to do with it, but I can't make the connection. Why am I looking at the rate of change? Can someone please explain it to me as simply as possible? Thanks.

Re: Log-Lin, Lin-Log, and Log-Log Regression

How do we determine what the percentage change is?

$100 * \dfrac{y_2 - y_1}{y_1} = \dfrac{100\Delta y}{y}$ is the percentage change in y, correct? But let's forget percentages. The **relative** change in y is $\dfrac{\Delta y}{y}.$

And the relative change in x is $\dfrac{\Delta x}{x}.$ Very easy. Now for a weird way to compute $\dfrac{\Delta y}{\Delta x}.$

$y * \dfrac{1}{x} * \left(\dfrac{\dfrac{\Delta y}{y}}{\dfrac{\Delta x}{x}}\right) = \dfrac{y}{x} * \dfrac{\Delta y}{y} * \dfrac{x}{\Delta x} = \dfrac{\Delta y}{\Delta x}.$ This is just some cancellation. May look a bit ugly, but nothing really hard.

$y = \beta_0 + \beta_1 ln(x) \implies \dfrac{dy}{dx} = \beta_1 * \dfrac{1}{x}.$ Basic calculus.

If $\Delta x$ is small, $\dfrac{dy}{dx} \approx \dfrac{\Delta y}{\Delta x} \implies$

$\beta_1 * \dfrac{1}{x} \approx y * \dfrac{1}{x} * \left(\dfrac{\dfrac{\Delta y}{y}}{\dfrac{\Delta x}{x}}\right) \implies \beta_1 \approx y * \left(\dfrac{\dfrac{ \Delta y}{y}}{\dfrac{\Delta x}{x}}\right) \implies \beta_1 * \dfrac{\Delta x}{x} \approx y * \dfrac{\Delta y}{y} \implies \beta_1 * \dfrac{\Delta x}{x} \approx \Delta y.$

In English, the change in y is approximately equal to $\beta_1$ times the relative change in x for small changes in x. Does this help?

Re: Log-Lin, Lin-Log, and Log-Log Regression

Thanks! That's an interesting way to look at it, but I still don't get why I'm looking at a percentage change. My real question, I guess, is why is deltax/x = 0.01 when I want to see the change in y? If I stick in 1 for deltax in Beta1*(deltax/x) why is x in the denominator supposed to be 100?

Also, if the derivative of y with respect to x is approximately equal to the change in y divided by the change in x for tiny changes in x, how are you supposed to look at your regression line if you want to see what happens to y if you increase x from 1% to 20%?

Re: Log-Lin, Lin-Log, and Log-Log Regression

First off, I think you are confusing percentage change and relative change. Percentage change is 100 times relative change. This whole talk about percentages makes things more complex: you have to multiply by 100 at the right times and divide by 100 at others. Percentages do make sense in terms of presentation, but the underlying math has nothing to do with percentages.

To show how much confusion percentages are creating here: what do you mean by increasing x from 1% to 20%? Normally, x is a number, not a percentage. If you mean x increases from 1 to 20, that is not a small change in x. It is a relative change of 19, calculated as (20 - 1)/1, or a percentage change of 1900%. It is a huge relative increase.

Once you have your regression equation of $y = \beta_0 + \beta_1 ln(x)$, finding y can be done without any reference to relative change or percentage change. You put x into the equation and get y.

I think the statement that is confusing you is something like this. When you have a log relationship, it means that (for small changes in x), the change in y is approximately proportional to the relative change in x whereas, in a linear relationship, the change in y is proportional to the change in x.

So let's say $\beta_1 = 950.$ If $\dfrac{\Delta x}{x} = 0.01$, the relative change is small, so the change in y $\approx 950 * 0.01 = 9.5,$ and the new y = the old y + 9.5. I can do the computation in my head.

Let's see how good that approximation is.

Old x = 100. New x = 101. Let's say $\beta_0 = 3119.$

$Old\ y = 3119 + 950 * ln(100) \approx 7493.9.\ New\ y = 3119 + 950 * ln(101) \approx 7503.4.\ So\ \Delta y \approx 7503.4 - 7493.9 = 9.5.$
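A quick way to check this is to let a computer do the arithmetic. Here is a short Python sketch using the same numbers as the example above:

```python
import math

# Coefficients and x values from the example above
beta0, beta1 = 3119, 950
x_old, x_new = 100, 101  # a 1% increase in x

y_old = beta0 + beta1 * math.log(x_old)
y_new = beta0 + beta1 * math.log(x_new)

exact = y_new - y_old                     # exact change in y
approx = beta1 * (x_new - x_old) / x_old  # beta1 times the relative change in x

print(round(exact, 2))   # 9.45
print(round(approx, 2))  # 9.5
```

The exact change is about 9.45 and the shortcut gives 9.5, so the approximation is off by only about half a percent here.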

Re: Log-Lin, Lin-Log, and Log-Log Regression

Quote:

Originally Posted by

**mathbeginner97** If I stick in 1 for deltax in Beta1*(deltax/x) why is x in the denominator supposed to be 100?

Upon rereading, I see you may have another source of confusion that I may have inadvertently compounded with the example in my previous post.

We have an equation $y = \beta_0 + \beta_1 ln(x).$ If we are interested in $\Delta y$ and $\Delta x$ is "small" relative to $x$, then we can quickly approximate $\Delta y$ by

$\Delta y \approx \beta_1 * \dfrac{\Delta x}{x}.$

How do we determine whether $\Delta x$ is "small" relative to $x$? This depends on how exact an answer we need (remember that when dealing with logs, we are not going to be exact anyway).

For many practical purposes, if $\left|\dfrac{\Delta x}{x}\right| \le 0.05$, then the approximation is quite good.

Now if 0.01 (or 1%) is small enough for your purposes, then $\dfrac{\Delta x}{x} = 0.01 \iff x = 100 \Delta x \iff \Delta x = 0.01x.$

If you set $\Delta x = 500$ and $\dfrac{\Delta x}{x} = 0.01$, then $x = 100 * 500 = 50,000.$

If you set $\Delta x = 1$ and $\dfrac{\Delta x}{x} = 0.01$, then $x = 100 * 1 = 100.$

Re: Log-Lin, Lin-Log, and Log-Log Regression

Thanks so much! Now it makes a lot more sense, if I'm understanding it right that is. So basically when you have a lin-log model, and you look at the change in Y, you look at the relative change in x because that's the derivative of a log or ln function. deltay / deltax is approximately equal to the derivative of y with respect to x when the change is tiny. I read about first principles, and that's how they introduce how the derivative comes about so it makes sense when I think of why the two are approximately equal. Then when you do the thing you did in your first post, you find that Beta1*ln(x) is approximately the change in y when x is small. When you multiply Beta1 by that tiny number you roughly get the change in y. Is that right?

By the way, interpreting the coefficient B1 in lin-log models is only relevant when x changes by a tiny amount like 0.01 or 0.03, right? Based on what I know and what I understand from this thread, if the relative change in x is like 2.5 (as opposed to 0.01), it doesn't work.

By the way, why did you say we don't need to be exact with logs in your last post?

Re: Log-Lin, Lin-Log, and Log-Log Regression

I don't want to start a new thread about this so I'll post my question here: Does anyone know how to break down the mechanics of a log-log or log-lin model like JeffM showed in the above posts?

When you have a log-lin model in the form of ln(y) = B_{0} + B_{1}x, I think you take an implicit derivative to get (1/y)*(dy/dx)= B_{1}

Then you rearrange and find dy/y = (dx)*B_{1} ... Now I'm stuck!

Re: Log-Lin, Lin-Log, and Log-Log Regression

Quote:

Originally Posted by

**mathbeginner97** Thanks so much! Now it makes a lot more sense, if I'm understanding it right that is. So basically when you have a lin-log model, and you look at the change in Y, you look at the relative change in x because that's the derivative of a log or ln function. deltay / deltax is approximately equal to the derivative of y with respect to x when the change is tiny. I read about first principles, and that's how they introduce how the derivative comes about so it makes sense when I think of why the two are approximately equal. Then when you do the thing you did in your first post, you find that Beta1*(deltax/x) is approximately the change in y when deltax is small. When you multiply Beta1 by that tiny number you roughly get the change in y. Is that right?

By the way, interpreting the coefficient B1 in lin-log models is only relevant when x changes by a tiny amount like 0.01 or 0.03, right? Based on what I know and what I understand from this thread, if the relative change in x is like 2.5 (as opposed to 0.01), it doesn't work.

By the way, why did you say we don't need to be exact with logs in your last post?

I get nervous whenever anyone (including me) tries to translate mathematical results into a natural language. Mathematicians have spent millennia developing a vocabulary and notation that is both concise and relatively free of ambiguity, and translating math into natural language is likely to become both verbose and ambiguous. Not being a mathematician myself, however, I sympathize with those who want a non-mathematical summary of mathematical results. I am just not sure that it is always possible. I hope any other tutor will feel free to correct this post or to supplement it.

Quote:

Originally Posted by

**mathbeginner97** By the way, why did you say we don't need to be exact with logs in your last post?

You seem to be studying regression. Regression almost never gives an exact fit for the actual data. How the "errors" are interpreted varies: you can say that the actual data is subject to measurement error (almost certain to be true) or that other variables ignored in the regression have a real but minor influence (frequently true as well). So what you get out of a regression model is almost always an approximation anyway. Furthermore, what you get out of a log table or the log function in a calculator or computer is almost always an approximation as well. So we are dealing with approximations all the way around. Does that make sense? No need to get in a tizzy because we say that something is just an approximation.

Quote:

Originally Posted by

**mathbeginner97** deltay / deltax is approximately equal to the derivative of y with respect to x when the change is tiny. I read about first principles, and that's how they introduce how the derivative comes about so it makes sense when I think of why the two are approximately equal.

Yes. $\displaystyle \dfrac{dy}{dx} = \lim_{\Delta x \rightarrow 0}\dfrac{\Delta y}{\Delta x} \implies \dfrac{dy}{dx} \approx \dfrac{\Delta y}{\Delta x}\ for\ small\ enough\ \Delta x.$ How small is small enough depends on how exact your answer needs to be.
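You can also watch that limit happen numerically. Taking $ln(x)$ at $x = 100$, where the derivative is $1/x = 0.01$, the secant slope $\dfrac{\Delta y}{\Delta x}$ closes in on it as $\Delta x$ shrinks (a small Python sketch):

```python
import math

x = 100.0  # the derivative of ln(x) here is 1/x = 0.01
for dx in (10.0, 1.0, 0.1, 0.01):
    slope = (math.log(x + dx) - math.log(x)) / dx  # delta y / delta x
    print(f"dx = {dx}: slope = {slope:.6f}")
```

Each slope is a little below 0.01 (ln is concave), and the gap shrinks roughly in proportion to $\Delta x$.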

Quote:

Originally Posted by

**mathbeginner97** So basically when you have a lin-log model, and you look at the change in Y, you look at the relative change in x because that's the derivative of a log or ln function.

You may have the correct idea, but that statement is just incorrect. The derivative of ln(x) is 1/x, which is not the relative change in x. Here is the math.

$y = \beta_0 + \beta_1ln(x) \implies \dfrac{dy}{dx} = \beta_1 * \dfrac{1}{x} \implies \dfrac{\Delta y}{\Delta x} \approx \beta_1 * \dfrac{1}{x}\ for\ small\ enough\ \Delta x \implies$

$\Delta y \approx \beta_1 * \dfrac{\Delta x}{x}\ for\ small\ enough\ \Delta x.$ The change in y = $\Delta y$, and the relative change in x is $\dfrac{\Delta x}{x}.$

Now the way I would translate that into English is:

**the change in y is approximately proportional to the relative change in x, with a constant of proportionality of $\beta_1.$**

You are completely right that the approximation does not work for relatively large changes in x. If that is what you have to deal with, you need to use your actual regression equation, not the approximating shortcut.

Re: Log-Lin, Lin-Log, and Log-Log Regression

Quote:

Originally Posted by

**mathbeginner97** I don't want to start a new thread about this so I'll post my question here: Does anyone know how to break down the mechanics of a log-log or log-lin model like JeffM showed in the above posts?

When you have a log-lin model in the form of ln(y) = B_{0} + B_{1}x, I think you take an implicit derivative to get (1/y)*(dy/dx)= B_{1}

Then you rearrange and find dy/y = (dx)*B_{1} ... Now I'm stuck!

Stuck on what?

$ln(y) = \beta_0 + \beta_1 x \implies \dfrac{1}{y} * \dfrac{dy}{dx} = \beta_1.$

$So,\ for\ small\ \Delta x,\ \dfrac{1}{y} * \dfrac{dy}{dx} = \beta_1 \implies \dfrac{1}{y} * \dfrac{\Delta y}{\Delta x} \approx \beta_1 \implies \dfrac{\Delta y}{y} \approx \beta_1 \Delta x.$

In English, the relative change in y is approximately proportional to the change in x, provided the change in x is small enough. That explains the relationship between changes in y and small changes in x, but it does not help you use the formula to find y given some x.

$ln(y) = \beta_0 + \beta_1 x \implies y = e^u,\ where\ u = \beta_0 + \beta_1 x.$

And yes, we get the same result differentiating explicitly: $\dfrac{dy}{dx} = \dfrac{dy}{du} * \dfrac{du}{dx} = e^u * \beta_1 = \beta_1 y \implies \dfrac{1}{y} * \dfrac{dy}{dx} = \beta_1.$
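Here is the same kind of numerical check for the log-lin case. The coefficients below are not from any regression, just made-up numbers for illustration:

```python
import math

beta0, beta1 = 2.0, 0.05   # hypothetical coefficients
x_old, x_new = 10.0, 10.5  # a small change: delta x = 0.5

# ln(y) = beta0 + beta1*x  implies  y = e^(beta0 + beta1*x)
y_old = math.exp(beta0 + beta1 * x_old)
y_new = math.exp(beta0 + beta1 * x_new)

exact_rel = (y_new - y_old) / y_old   # exact relative change in y
approx_rel = beta1 * (x_new - x_old)  # beta1 * delta x

print(round(exact_rel, 4))   # 0.0253
print(round(approx_rel, 4))  # 0.025
```

Again the shortcut (0.025) lands very close to the exact relative change (about 0.0253).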

As for log-log, let's figure out how to compute a y value from an x value first.

$ln(y) = \beta_0 + \beta_1 ln(x).$

$Let\ \gamma = e^{\beta_0},\ which \implies ln( \gamma ) = ln\left(e^{\beta_0}\right) = \beta_0 * ln(e) = \beta_0 * 1 = \beta_0.$

$So,\ ln(y) = \beta_0 + \beta_1 ln(x) = ln( \gamma) + \beta_1 ln(x) = ln( \gamma) + ln\left(x^{\beta_1}\right) = ln\left(\gamma x^{\beta_1}\right) \implies y = \gamma x^{\beta_1}.$

y is proportional to a power of x, with a constant of proportionality of $e^{\beta_0}$ and a power of $\beta_1.$

$y = \gamma x^{\beta_1} \implies \dfrac{dy}{dx} = \gamma \beta_1 x^{(\beta_1 - 1)} = \gamma \beta_1 * \dfrac{x^{\beta_1}}{x} = \beta_1 * \dfrac{\gamma x^{\beta_1}}{x} = \beta_1 * \dfrac{y}{x}.$

$So,\ for\ small\ \Delta x,\ \dfrac{dy}{dx} = \beta_1 * \dfrac{y}{x} \implies \dfrac{\Delta y}{\Delta x} \approx \beta_1 * \dfrac{y}{x} \implies \dfrac{\Delta y}{y} \approx \beta_1 \dfrac{\Delta x}{x}.$

In English, the relative change in y is approximately proportional to the relative change in x, provided the change in x is small enough.

In lin-lin, the change in y and the change in x are strictly proportional.

In lin-log, the change in y is approximately proportional to the relative change in x (for small changes in x).

In log-lin, the relative change in y is approximately proportional to the change in x (for small changes in x).

In log-log, the relative change in y is approximately proportional to the relative change in x (for small changes in x).
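The four rules can be lined up in one small sketch. All the numbers here (b1, x, dx) are hypothetical, chosen only to show the pattern; each line computes the approximate effect of a small change dx in x:

```python
b1 = 2.0           # hypothetical slope coefficient
x, dx = 50.0, 1.0  # hypothetical current x and a small change in it

dy_linlin = b1 * dx          # lin-lin: change in y (exact, not approximate)
dy_linlog = b1 * dx / x      # lin-log: change in y ~ b1 * relative change in x
rel_dy_loglin = b1 * dx      # log-lin: relative change in y ~ b1 * change in x
rel_dy_loglog = b1 * dx / x  # log-log: relative change in y ~ b1 * relative change in x

print(dy_linlin, dy_linlog, rel_dy_loglin, rel_dy_loglog)  # 2.0 0.04 2.0 0.04
```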

See, there was a reason why you studied logs way back when.

Re: Log-Lin, Lin-Log, and Log-Log Regression

Thank you so much, again! I'm in high school and I'm doing data management courses. I haven't done calculus yet, so the teacher can't explain why certain stuff happens, but I was interested so I watched Khan Academy videos about calculus. It turns out it's really useful and not so bad at a basic level. Maybe next year or in college I'll learn how regression works. At this point all I know is what the equations for the lines look like and, thanks to you, how to interpret some of them.

By the way, how do you know when to do certain things when figuring some problems out? Like the "messy" cancellation thing in your first post, and making B_{0} into ln(e^{B0}) in your last post?

Re: Log-Lin, Lin-Log, and Log-Log Regression

Quote:

Originally Posted by

**mathbeginner97** Thank you so much, again! I'm in high school and I'm doing data management courses. I haven't done calculus yet, so the teacher can't explain why certain stuff happens, but I was interested so I watched Khan Academy videos about calculus. It turns out it's really useful and not so bad at a basic level. Maybe next year or in college I'll learn how regression works. At this point all I know is what the equations for the lines look like and, thanks to you, how to interpret some of them.

By the way, how do you know when to do certain things when figuring some problems out? Like the "messy" cancellation thing in your first post, and making B_{0} into ln(e^{B0}) in your last post?

I suspect that the transition from algebra to calculus is less hard for students than the transition from arithmetic to algebra. (I have no scientific basis for that assertion.) In any field of study at the high school level, there are usually some things that cannot be explained because they depend on other fields of study that have not yet been broached. Although unavoidable, it is frustrating for students, particularly good students.

You don't KNOW how to solve problems in the abstract. What you do is to learn a bunch of techniques that sometimes help solve problems, and you apply the techniques that seem applicable to the problem at hand. In many cases, you know where you want to end up, at least approximately, so you at least have a sense of direction and that lets you pick the more promising techniques first. Problem solving is a **creative** process unless you recognize immediately what technique to apply. A lot of high school math is just teaching you techniques that will come in handy for solving various types of problem so the creative aspect is minimized.

I suspect if you look back you will see that I primarily used just five techniques or tools:

(1) Cross-multiplying. Probably learned in grade school.

(2) $ln(a) = b \iff a = e^b.$ Learned in high school.

(3) Laws of logs. Learned in high school.

(4) $\dfrac{dy}{dx} \approx \dfrac{\Delta y}{\Delta x}\ for\ small\ enough\ \Delta x.$ Learned in differential calculus.

(5) $y = ln(x) \implies \dfrac{dy}{dx} = \dfrac{1}{x}.$ Learned in differential calculus.