The correlation coefficient (denoted ‘r’) measures the strength and direction of the linear relationship between two variables X and Y. A positive value of r means that the variables are positively correlated, and a negative value of r means that they are negatively correlated.
Some of the properties of Karl Pearson’s coefficient of correlation are as follows:
1. The value of the correlation coefficient between two variables X and Y can be calculated using the formula,
r = Cov(X, Y) / (SD(X) × SD(Y)).
Here, Cov(X, Y) is the covariance of X and Y, and SD(X) and SD(Y) are their standard deviations.
2. The correlation coefficient is symmetric between the variables X and Y. That is, r_xy = r_yx.
3. The value of the correlation coefficient always lies between -1 and +1.
-1 ≤ r_xy ≤ +1.
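4. The correlation coefficient remains unaffected by a change of scale or change of origin. For example, if u = (x - a)/h and v = (y - b)/k, where a and b are constants and h and k are positive constants, then,
r_xy = r_uv.
If h and k have opposite signs, the magnitude of r is unchanged but its sign is reversed.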
5. The correlation coefficient is a unitless quantity.
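6. The correlation coefficient is the geometric mean of the two regression coefficients b_xy and b_yx,
r^2 = b_xy × b_yx, so that r = ±√(b_xy × b_yx), where r takes the same sign as the two regression coefficients.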
7. If the correlation coefficient is zero, it does not mean that the two variables are independent. It only shows that there is no linear dependence between the two variables.
The two variables may still be dependent on each other via a non-linear relationship. For example, suppose we calculate the correlation coefficient for the following data values:
We can calculate that r = 0 for the given data, but the variables X and Y are dependent on each other via the relationship y = x^2.
8. We can calculate the coefficient of determination, r^2, which measures the explanatory power of a regression model.
Suppose we fit a simple linear regression of the dependent variable Y on the independent variable X. If the value of r^2 is 0.87, then 87% of the variation in Y is explained by the independent variable X.
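Several of the properties listed above can be checked numerically. The following is a minimal sketch in plain Python, not a reference implementation. The helper names (`covariance`, `sd`, `pearson_r`, `slope`) and all data values are assumptions made up for illustration. In particular, `x_sym` and `y_sq` are a hypothetical stand-in for the zero-correlation example, not the article's original data.

```python
from math import sqrt, isclose

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance Cov(X, Y)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def sd(values):
    """Population standard deviation, i.e. sqrt of Cov(X, X)."""
    return sqrt(covariance(values, values))

def pearson_r(xs, ys):
    """Property 1: r = Cov(X, Y) / (SD(X) * SD(Y))."""
    return covariance(xs, ys) / (sd(xs) * sd(ys))

def slope(dep, indep):
    """Regression coefficient of `dep` on `indep`: Cov / Var(indep)."""
    return covariance(indep, dep) / covariance(indep, indep)

# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)

# Property 2: symmetry, r_xy = r_yx.
r_swapped = pearson_r(y, x)

# Property 4: change of origin (a, b) and scale (h, k > 0) leaves r unchanged.
a, h, b, k = 3, 2, 4, 10
u = [(xi - a) / h for xi in x]
v = [(yi - b) / k for yi in y]
r_uv = pearson_r(u, v)

# Property 6: r^2 equals the product of the two regression coefficients.
b_yx = slope(y, x)  # regression of Y on X
b_xy = slope(x, y)  # regression of X on Y

# Property 7: y = x^2 on x values symmetric about zero gives r = 0,
# even though y is completely determined by x.
x_sym = [-2, -1, 0, 1, 2]
y_sq = [xi ** 2 for xi in x_sym]
r_zero = pearson_r(x_sym, y_sq)

# Property 8: for a simple linear regression of Y on X, r^2 equals
# 1 - SSE/SST, the share of the variation in Y explained by X.
intercept = mean(y) - b_yx * mean(x)
sse = sum((yi - (intercept + b_yx * xi)) ** 2 for xi, yi in zip(x, y))
sst = sum((yi - mean(y)) ** 2 for yi in y)
r_squared = 1 - sse / sst

print("r =", round(r, 4))
print("symmetric:", isclose(r, r_swapped))
print("invariant under origin/scale change:", isclose(r, r_uv))
print("r^2 = b_xy * b_yx:", isclose(r ** 2, b_xy * b_yx))
print("r for y = x^2:", r_zero)
print("r^2 = 1 - SSE/SST:", isclose(r ** 2, r_squared))
```

The sketch uses population (divide by n) covariance and standard deviation throughout. Using the sample (divide by n - 1) versions consistently would give the same r, because the factor cancels in the ratio.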