# Thread: spearman's rank correlation formula

1. ## spearman's rank correlation formula

Please, how do i prove the spearman's rank correlation formula?

2. ## Re: spearman's rank correlation formula

When you did not get an immediate answer you just post the same question again?

Did you at least read the rules for this forum? (When you register, you have to click on a button saying that you have read them. Did you lie?) They tell you that you have to first make an attempt to solve the problem yourself, state the problem as clearly as possible, and show what you have tried.

It's been a long time since I have done anything in statistics (and not much then) but when I googled "Spearman's rank correlation formula" I immediately got this: https://geographyfieldwork.com/SpearmansRank.htm

Calculating the coefficient:

Create a table from your data.

Rank the two data sets. Ranking is achieved by giving the ranking '1' to the biggest number in a column, '2' to the second biggest value and so on. The smallest value in the column will get the lowest ranking. This should be done for both sets of measurements.

Tied scores are given the mean (average) rank. For example, the three tied scores of 1 euro in the example below are ranked fifth in order of price, but occupy three positions (fifth, sixth and seventh) in a ranking hierarchy of ten. The mean rank in this case is calculated as (5+6+7) ÷ 3 = 6.

Find the difference in the ranks (d): This is the difference between the ranks of the two values on each row of the table. The rank of the second value (price) is subtracted from the rank of the first (distance from the museum).

Square the differences (d²) to remove negative values, and then sum them (∑d²).

Calculate the coefficient (R) using the formula below. The answer will always be between 1.0 (a perfect positive correlation) and -1.0 (a perfect negative correlation).
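The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not anything from the linked page: the function names `average_ranks` and `spearman_r` and the sample numbers are made up here, and it uses the shortcut formula $R = 1 - \frac{6\sum d^2}{n^3 - n}$, which is exact when there are no ties and only an approximation when tied values share a mean rank.

```python
def average_ranks(values):
    """Rank values so the largest gets rank 1; tied values share the mean
    of the positions they occupy (e.g. positions 5, 6, 7 -> rank 6)."""
    # Indices of the values, ordered from largest to smallest.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of values tied with the one at `pos`.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions are 1-based, so the run occupies pos+1 .. end+1.
        mean_rank = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman_r(xs, ys):
    """Shortcut Spearman coefficient: 1 - 6 * sum(d^2) / (n^3 - n)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    n = len(xs)
    rank_x = average_ranks(xs)
    rank_y = average_ranks(ys)
    # d is the difference between the two ranks on each row of the table.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n ** 3 - n)


# Hypothetical data: five points, two measurements each.
print(average_ranks([10, 20, 20, 5]))               # [3.0, 1.5, 1.5, 4.0]
print(spearman_r([1, 2, 3, 4, 5], [5, 4, 3, 2, 1]))  # -1.0
```

Ranking the largest value as 1 follows the description above; ranking the smallest as 1 instead would give the same coefficient, as long as both columns are ranked the same way.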

When written in mathematical notation the Spearman Rank formula looks like this:
$R = 1 - \frac{6\sum d^2}{n^3 - n}$.
That is, given a set of data, each data point having two numeric properties (the example cited here gives the distance from a given museum to a convenience store and the cost of a bottle of water at that convenience store), you rank each of the properties in order. The convenience store farthest from the museum is ranked 1, the next farthest is ranked 2, etc., until the closest is ranked "n", where n is the number of convenience stores sampled. For cost, the store with the highest cost is ranked 1, the next highest is ranked 2, etc., until the least expensive is ranked n.

Now, for each store, subtract one rank from the other. Which is subtracted from which is irrelevant because we then square each difference and add them all up. That's the "$\sum d^2$" in the formula. n is the number of data points.
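As for where the formula comes from: one standard route, assuming there are no ties, is to apply the ordinary (Pearson) correlation coefficient to the ranks themselves. The following is only a sketch; the symbols $x_i$, $y_i$ (the two ranks of data point $i$), $S$ and $S_{xy}$ are notation introduced here, not taken from the linked page. With no ties, each set of ranks is a rearrangement of $1, 2, \dots, n$, so both have mean $\bar{x} = \bar{y} = \frac{n+1}{2}$ and the same sum of squared deviations $S$:

$$
\begin{aligned}
S &= \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} i^2 - n\left(\frac{n+1}{2}\right)^2 = \frac{n(n^2 - 1)}{12}, \\
\sum d^2 &= \sum_{i=1}^{n} \bigl[(x_i - \bar{x}) - (y_i - \bar{y})\bigr]^2 = 2S - 2S_{xy}, \qquad S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}), \\
R &= \frac{S_{xy}}{S} = 1 - \frac{\sum d^2}{2S} = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6\sum d^2}{n^3 - n}.
\end{aligned}
$$

The first line uses $\sum i^2 = \frac{n(n+1)(2n+1)}{6}$. When there are tied ranks, $S$ is no longer the same for both columns, so the last line holds only approximately.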