# Linear Correlations

• Nov 10th 2009, 03:39 PM
lexx
Linear Correlations
The following set of data relates mean word length and recommended age level for a set of children's books.
recommended age level - Mean Word Length
4 - 3.5
6 - 5.5
5 - 4.6
6 - 5.0
7 - 5.2
9 - 6.5
8 - 6.1
5 - 4.9
a. Create a scatterplot and classify the linear correlation
b. determine the correlation coefficient
c. determine the line of best fit
d. use this model to predict the average word length in a book recommended for 12 year olds

I'm pretty sure I've figured out that a. is a positive linear correlation. However, when it comes to the other questions, I'm not sure which formulas to use or what data to put into them. Help with the formulas for these last questions would be greatly appreciated!
• Nov 11th 2009, 11:31 AM
ANDS!
What book are you using that doesn't give you the computational formula for the equation? Alternatively, you can use a TI-83.
• Nov 12th 2009, 12:51 PM
lexx
The book I'm using only explains how to do this on a TI-83. I don't have a graphing calculator, so I can't use that method. That's why I'm trying to figure out how to solve it without the calculator, using the paper-and-pencil method.
• Nov 13th 2009, 04:19 PM
ANDS!
Ok well you should be able to construct the scatter plot. That's no problemo.

The correlation coefficient "r" is calculated in a very tedious way if you do not have access to a calculator. It is:

$\displaystyle r = \frac{n\Sigma(xy)-(\Sigma(x))(\Sigma(y))}{\sqrt{[n(\Sigma(x^2))-(\Sigma(x))^2][n(\Sigma(y^2))-(\Sigma(y))^2]}}$

Does that look awful? Well, yes it does. It's actually not that bad (if you know what all of that means), but I imagine your instructor wants you to have familiarity with the computations here. Here is what I would do:

Create a list on a sheet of paper. In one column write down the X-values. In the other column write down the Y-values. Next to the y-values create another column called "x*y". Then for each pair of data, do just that - x*y. Next to the x*y, do x-squared, and next to x-squared, y-squared. Go down through the list of paired data and perform each line of operations. At the bottom, sum each column up. Once you have done that, all you need to do is take the sum of the columns, and plug them into that ugly equation. Boom - you have your correlation coefficient.
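The column method described above can be sketched in Python as a check on the hand computation. This is only an illustration using the data from the first post; the names `ages`, `word_lengths`, `column_sums` and `correlation` are made up for this sketch and are not from any textbook or library.

```python
from math import sqrt

# Paired data from the first post: x = recommended age, y = mean word length.
ages = [4, 6, 5, 6, 7, 9, 8, 5]
word_lengths = [3.5, 5.5, 4.6, 5.0, 5.2, 6.5, 6.1, 4.9]


def column_sums(xs, ys):
    """Return n and the totals of the x, y, x*y, x^2 and y^2 columns."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    return n, sum_x, sum_y, sum_xy, sum_x2, sum_y2


def correlation(xs, ys):
    """Plug the column totals into the computational formula for r."""
    n, sx, sy, sxy, sx2, sy2 = column_sums(xs, ys)
    numerator = n * sxy - sx * sy
    denominator = sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))
    return numerator / denominator


print("n, sum x, sum y, sum xy, sum x^2, sum y^2:",
      column_sums(ages, word_lengths))
print("r =", round(correlation(ages, word_lengths), 3))
```

For this data set r comes out at roughly 0.93, which agrees with the strong positive linear correlation seen in the scatter plot.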

For C., do you know how to determine the line of best fit? It's another ugly bit of equations, but with that column chart, you're just plugging in values. And D. is simply a matter of using your new equation to make an estimate.
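For part C, the usual least-squares formulas reuse the same column totals: slope m = [nΣ(xy) - Σ(x)Σ(y)] / [nΣ(x²) - (Σ(x))²] and intercept b = [Σ(y) - mΣ(x)] / n. A minimal sketch, assuming the data from the first post and a hypothetical helper name `best_fit_line`:

```python
# Paired data from the first post: x = recommended age, y = mean word length.
ages = [4, 6, 5, 6, 7, 9, 8, 5]
word_lengths = [3.5, 5.5, 4.6, 5.0, 5.2, 6.5, 6.1, 4.9]


def best_fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = m*x + b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Note the parentheses: the whole numerator is divided by the whole denominator.
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept


slope, intercept = best_fit_line(ages, word_lengths)
print(f"y = {slope:.2f}x + {intercept:.2f}")
```

With this data the slope is about 0.52 and the intercept about 1.93 when the unrounded slope is carried through; rounding the slope to 0.52 first shifts the intercept slightly.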
• Nov 18th 2009, 04:10 PM
lexx
Okay, so I figured out the correlation pretty easily after finding the equation, thank you for that. However, I'm now stuck on determining the line of best fit. This is how I went about it:

Slope = [n(Sigma(xy)) - (Sigma(x))(Sigma(y))] / [n(Sigma(x^2)) - (Sigma(x))^2]
= [8(268.2) - (50)(41.3)] / [8(332) - (50)^2]
= (2145.6 - 2065) / (2656 - 2500)
= 80.6 / 156
= 0.52

Intercept = [Sigma(y) - m(Sigma(x))] / n
= [41.3 - 0.52(50)] / 8
= 1.9125

y = mx + b
y = 0.52x + 1.91

However, when I try to work out the average word length for a 12-year-old, I get 4.33... that's even less than what's recommended for a 6-year-old?!
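As a check on that last step, substituting x = 12 into the line from the post gives 0.52(12) + 1.91 = 6.24 + 1.91, which is about 8.15 rather than 4.33, so the model itself looks fine and the slip is likely in the substitution. A small sketch, assuming the rounded model y = 0.52x + 1.91 from the post (the function name `predict_word_length` is made up for illustration):

```python
def predict_word_length(age, slope=0.52, intercept=1.91):
    """Evaluate the fitted line y = slope * age + intercept."""
    return slope * age + intercept


# Compare the prediction for age 12 with one inside the observed range.
for age in (6, 12):
    print(age, round(predict_word_length(age), 2))
```

Keep in mind that the data only cover ages 4 to 9, so a prediction for age 12 is an extrapolation and should be treated as a rough estimate.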