What is the R-squared value if you include points 1 through 4?
Hi all, I have a curve that looks like this:
Let's name the points from left to right starting with 1.
I have to calculate the gradient of the line of best fit through the linear region. From the graph, the linear region appears to lie between points 2 and 4. However, according to Microsoft Excel, the R-squared value of a line of best fit through points 1 to 3 is 0.99, whereas that through points 2 to 4 is only 0.92.
So should I choose points 1 to 3 or 2 to 4 as the linear region?
I think I have got it. The X-axis is on a log scale. What appears to be linear on the log scale is in fact not linear on the linear scale.
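To illustrate the point about the axis scale, here is a minimal sketch in plain Python. The data are made up for illustration (they are not the values from my graph): the signal is constructed to be exactly linear in log10(x), and the same least-squares fit is run once against x and once against log10(x). The names `linear_fit`, `conc` and `signal` are my own.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical data: y is exactly linear in log10(x).
conc = [1.0, 10.0, 100.0, 1000.0, 10000.0]
signal = [0.1 + 0.5 * math.log10(x) for x in conc]

# Same points, fitted against the raw x-values and against log10(x).
_, _, r2_raw = linear_fit(conc, signal)
slope_log, _, r2_log = linear_fit([math.log10(x) for x in conc], signal)

print(f"R^2 against x:        {r2_raw:.3f}")
print(f"R^2 against log10(x): {r2_log:.3f}")
```

With these made-up numbers the fit against log10(x) is essentially perfect, while the fit against x is clearly worse, so the R-squared a spreadsheet reports depends on which of the two it was given as the x-column.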
But I have another question. Which do you think is the more accurate way to obtain the line of best fit: using 3 points with an R-squared value of 0.99, or using more points (only those that fall within the region that looks linear on the linear scale) with a lower R-squared value that is still above 0.9?
Thanks in advance!
Thanks Ackbeet! You have been a great help! I have yet another question on this topic.
Previously I had used a linear regression model to obtain the line of best fit of the plots. But I realised that a 4- or 5-parameter logistic regression model would be more appropriate and, more importantly, is commonly used in my kind of work (immunoassays). The curves now look like this:
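For readers who have not met these models, here is a minimal sketch of one common parameterisation of the 4- and 5-parameter logistic curves. The function and parameter names (`four_pl`, `five_pl`, `bottom`, `top`, `hill`, `asymmetry`) and the numbers are my own illustrative choices; the software I am using may write the model differently.

```python
def four_pl(x, bottom, top, ec50, hill):
    """4-parameter logistic: rises from `bottom` to `top` as x grows (x > 0).

    `ec50` is the x-value at which the response is halfway between
    `bottom` and `top`; `hill` controls the steepness.
    """
    return bottom + (top - bottom) / (1.0 + (ec50 / x) ** hill)

def five_pl(x, bottom, top, c, hill, asymmetry):
    """5-parameter logistic: a 4PL with an extra asymmetry exponent.

    With asymmetry == 1 this reduces to the 4PL. Otherwise `c` is no
    longer exactly the half-maximal x-value.
    """
    return bottom + (top - bottom) / (1.0 + (c / x) ** hill) ** asymmetry

# Hypothetical parameters, chosen only to show the sigmoid shape.
params = dict(bottom=0.05, top=2.0, ec50=1000.0, hill=1.2)
for x in [10.0, 100.0, 1000.0, 10000.0, 100000.0]:
    print(f"x = {x:>8.0f}  y = {four_pl(x, **params):.3f}")
```

Plotted against log10(x), this gives the familiar S-shape: flat near `bottom`, roughly straight around `ec50`, and flat again near `top`.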
I have to compare the 5 curves in this 1 graph, and then compare the corresponding 5 curves across 3 such graphs. Initially I had wanted to calculate the y-values for comparison by choosing a particular x-value that lies within the 'linear' region of all 5 curves in all 3 graphs, but I realised this is not possible even within a single graph, as the 'linear' region of some curves lies beyond that of others. To clarify my point, I can use an x-value of 10,000 for the 3 upper curves, but this x-value would not lie in the 'linear' region of the 2 lower curves in the graph above.
So, instead of using any particular x-value, I am thinking of using the EC50 value (I am not sure what this is called in mathematics, but in case you do not know what an EC50 is, please refer to the EC50 article on Wikipedia). By using this value, I will be comparing x-values, as opposed to the y-values in the previous, impossible method (only because the software that I am using does not show the corresponding y-values). I think (or rather, I hope) that this is not a problem for a qualitative comparison.
However, using the EC50 value just doesn't seem right to me as a basis for a valid comparison. I can't get over the fact that both the x-values and the y-values at the EC50 differ between curves, and yet only the x-values are being compared. I am still trying to make sense of how both the x- and y-values of the EC50 depend on the 'height' of the curve.
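To make the question concrete, here is a small sketch of what the EC50 measures under a 4-parameter logistic model, assuming the common parameterisation shown in the code. The three curves are hypothetical (not my data), and `half_max_x` is a helper I made up that bisects on log10(x) to find where an increasing curve reaches the midpoint between its bottom and top plateaus.

```python
import math

def four_pl(x, bottom, top, ec50, hill):
    """4-parameter logistic rising from `bottom` to `top` (x > 0)."""
    return bottom + (top - bottom) / (1.0 + (ec50 / x) ** hill)

def half_max_x(curve, bottom, top, lo=1e-3, hi=1e9, iters=200):
    """Bisect on log10(x) for the x where an increasing curve hits the midpoint."""
    target = (bottom + top) / 2.0
    log_lo, log_hi = math.log10(lo), math.log10(hi)
    for _ in range(iters):
        mid = (log_lo + log_hi) / 2.0
        if curve(10.0 ** mid) < target:
            log_lo = mid   # midpoint response not reached yet: move right
        else:
            log_hi = mid   # at or past the midpoint response: move left
    return 10.0 ** ((log_lo + log_hi) / 2.0)

# Hypothetical curves: name -> (bottom, top, ec50, hill)
curves = {
    "tall":          (0.05, 2.0, 1000.0, 1.0),
    "short":         (0.05, 0.8, 1000.0, 1.0),
    "short_shifted": (0.05, 0.8, 20000.0, 1.0),
}

results = {}
for name, (bottom, top, ec50, hill) in curves.items():
    curve = lambda x, p=(bottom, top, ec50, hill): four_pl(x, *p)
    x50 = half_max_x(curve, bottom, top)
    results[name] = (x50, curve(x50))
    print(f"{name:>13}: x at half-max = {x50:.1f}, y at half-max = {results[name][1]:.3f}")
```

In this sketch the "tall" and "short" curves have different y-values at the EC50 (each is the midpoint of its own plateaus) but the same x-value, while "short_shifted" has the same height as "short" and a different x-value. So, under these assumptions, the x-value of the EC50 reflects where the curve sits along the concentration axis rather than how tall it is. Whether that is the right quantity to compare for a given assay is a separate question that this sketch does not answer.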
I would very much appreciate your valuable opinion on making a valid comparison based on the EC50 value.
Thank you very much in advance!
Unfortunately, I don't think I know enough about your method to say anything very intelligent about it. Maybe CB or mr fantastic could weigh in on this.