R vs R-Squared – How are they Different?

The main difference between R and R2 is the following:

  • The quantity R is the Karl Pearson coefficient of correlation and it measures the degree of correlation between two variables X and Y.
  • The quantity R2 is the coefficient of determination. It measures the degree to which the independent variable X predicts the value of the dependent variable Y in linear regression.
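  • The value of R2 is never greater than the absolute value of R, because R lies between -1 and 1. In simple linear regression R2 is the square of R, so the two are equal only when R is 0, 1, or -1; otherwise R2 is strictly smaller.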
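
The relationship between the size of R and the size of R2 can be checked numerically. The following is a minimal sketch, assuming R2 is obtained by squaring R (as in simple linear regression); the list of R values and the helper name `compare` are made up for illustration.

```python
# Compare |R| with R squared for a few hypothetical correlation values.
values = [-1.0, -0.8, -0.5, 0.0, 0.3, 0.7822, 1.0]

def compare(r):
    """Return (|r|, r squared) for a correlation value r in [-1, 1]."""
    return abs(r), r * r

for r in values:
    abs_r, r_sq = compare(r)
    relation = "equal" if abs_r == r_sq else "R2 smaller"
    print(f"R = {r:+.4f}  |R| = {abs_r:.4f}  R2 = {r_sq:.4f}  ({relation})")
```

Squaring a number whose magnitude is below 1 shrinks it, which is why R2 only matches |R| at the endpoints 0 and ±1.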

Interpretation of the Coefficient of Determination R2:

Suppose that we have an independent variable X and a dependent variable Y. Linear regression lets us quantify the relationship between the predictor variable X and the response variable Y.

For example, we might have:

Y = Crop Yield.

X = Amount of Fertilizer.

The yield of crops in an agricultural farm clearly depends on the amount of fertilizer used when growing the crops. Hence, Y is the dependent (response) variable and X is the independent (predictor) variable.

We can fit a linear regression model in order to predict the values of the variable Y. The regression model is given by the equation:

Y = β1X + β0 + e.

  • Here, β0 and β1 are the regression coefficients.
  • e is the error term.

We can use the linear model above to predict the value of Y. The coefficient of determination R2 measures the proportion of the variability in the dependent variable Y that is explained by the independent variable X.
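
For reference, the usual least-squares definitions behind these quantities can be written as below. The notation is introduced here for illustration and is not taken from the article: (x_i, y_i) are the n observations, bars denote sample means, and hats denote fitted values.

```latex
\begin{align*}
\hat{\beta}_1 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
&
\hat{\beta}_0 &= \bar{y} - \hat{\beta}_1 \bar{x},
&
\hat{y}_i &= \hat{\beta}_1 x_i + \hat{\beta}_0, \\[6pt]
R^2 &= 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}.
\end{align*}
```

The denominator in the R2 formula is the total variation in Y about its mean, and the numerator is the part left unexplained by the fitted line, so R2 is the explained share.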

For instance, suppose that we are given the following set of data values:

Amount of Fertilizer (X)    Crop Yield (Y)
 1                           3
 2                           6
 5                           8
 4                          11
11                           9
 8                           4
15                          19
13                          22

We can calculate the regression coefficients and the coefficient of determination.

Regression Equation: ŷ = 1.03357X + 2.62739.

Coefficient of determination: R2 = 0.6118.
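
These figures can be reproduced with a few lines of code. The sketch below assumes the data table reads as the (X, Y) pairs (1, 3), (2, 6), (5, 8), (4, 11), (11, 9), (8, 4), (15, 19), (13, 22), and uses the ordinary least-squares formulas for a single predictor; the function name `fit_and_score` is hypothetical.

```python
# Fertilizer amounts (X) and crop yields (Y), as assumed from the data table.
x = [1, 2, 5, 4, 11, 8, 15, 13]
y = [3, 6, 8, 11, 9, 4, 19, 22]

def fit_and_score(x, y):
    """Ordinary least squares for one predictor.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares about the means.
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    # Residual (unexplained) and total sums of squares.
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot

slope, intercept, r_squared = fit_and_score(x, y)
print(f"slope = {slope:.5f}, intercept = {intercept:.5f}, R2 = {r_squared:.4f}")
```

Under that reading of the data, the slope and intercept agree with the regression equation quoted here, and R2 comes out at roughly 0.612.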

This means that only about 61.18% of the variation in crop yield (Y) is explained by the variation in the amount of fertilizer (X). This suggests that other factors also affect crop yield; the amount of water is one possible example.

Thus a low or moderate value of R2 can hint that the model is missing some independent variables, although it cannot tell us which ones.

R2 can also be used as a measure of how well the model fits the data. A high value of R2 means that the model explains most of the variation in Y, so its predictions tend to lie close to the observed values. A low value of R2 means that the model has low explanatory power.

Understanding the Meaning of R:

The coefficient of correlation R, on the other hand, measures the strength and direction of the linear relationship between the two variables. Taken by itself, it is not the proportion of the variation in one variable that is explained by the other.

For example, for the above set of data values for the crop yield and the amount of fertilizer, we have,

Correlation Coefficient: R = 0.7822.

This means that there is a high degree of correlation between the two variables. Note that we cannot say that X explains 78.22% of the variation in Y. As we have seen above, the actual proportion of the variation in Y explained by X is about 61.18%, which is the square of R.
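
The correlation coefficient can be computed directly from the same data. The sketch below again assumes the table reads as the pairs (1, 3), (2, 6), (5, 8), (4, 11), (11, 9), (8, 4), (15, 19), (13, 22); the function name `pearson_r` is hypothetical. It also negates Y to illustrate that R carries a sign while its square does not.

```python
import math

# Fertilizer amounts (X) and crop yields (Y), as assumed from the data table.
x = [1, 2, 5, 4, 11, 8, 15, 13]
y = [3, 6, 8, 11, 9, 4, 19, 22]

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

r = pearson_r(x, y)
print(f"R = {r:.4f}, R squared = {r * r:.4f}")

# Negating Y reverses the direction of the relationship:
# R changes sign, but its square stays the same.
r_neg = pearson_r(x, [-yi for yi in y])
print(f"R with Y negated = {r_neg:.4f}, R squared = {r_neg * r_neg:.4f}")
```

The sign of R tells us whether Y tends to rise or fall as X increases, which is information that R2 discards.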
