The correlation coefficient (denoted as ‘r’) measures the strength and direction of the linear relationship between two variables X and Y. If the value of r is positive, the variables are positively correlated, and if the value of r is negative, the variables are negatively correlated. Some of the properties of Karl Pearson’s coefficient of correlation are as follows:

1. The value of the correlation coefficient between two variables X and Y can be calculated using the formula,

r_{xy} = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n\sum x^2 - (\sum x)^2}\sqrt{n\sum y^2 - (\sum y)^2}}
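As a rough illustration of the formula above, the following sketch computes r directly from the raw sums. The function name `pearson_r` and the sample data are made up for this example and are not part of the original text.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from the raw-sums formula (illustrative helper)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        # r is undefined when either variable has zero variance
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator

# Hypothetical sample data, chosen only for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))
```

Swapping the arguments, `pearson_r(ys, xs)`, gives the same value, because the formula treats x and y identically.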

2. The correlation coefficient is symmetric between the variables X and Y. That is, r_{xy} = r_{yx}.

3. The value of the correlation coefficient always lies between -1 and +1.

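-1 \leq r_{xy} \leq +1

4. The correlation coefficient remains unaffected by a change of scale or a change of origin, provided the two scale factors have the same sign (if their signs differ, only the sign of r flips). For example, if u = \frac{x-a}{h} and v=\frac{y-b}{k} with h, k > 0, then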

r_{xy}=r_{uv}

5. The correlation coefficient is a unitless quantity.
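Property 4 can be checked numerically. The sketch below shifts and rescales some made-up data and compares the two coefficients. The helper `pearson_r`, the data, and the constants a, h, b, k are assumptions chosen for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from the raw-sums formula (illustrative helper)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / (
        math.sqrt(n * sxx - sx ** 2) * math.sqrt(n * syy - sy ** 2)
    )

# Hypothetical data and transformation constants
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, h = 3, 2     # origin and scale applied to x
b, k = 4, 10    # origin and scale applied to y

us = [(x - a) / h for x in xs]
vs = [(y - b) / k for y in ys]

r_xy = pearson_r(xs, ys)
r_uv = pearson_r(us, vs)
# Compare up to floating-point rounding rather than with ==
print(math.isclose(r_xy, r_uv))

# If the scale factors have opposite signs, only the sign of r changes
vs_flipped = [(y - b) / -k for y in ys]
r_flipped = pearson_r(us, vs_flipped)
print(math.isclose(r_flipped, -r_xy))
```

The comparison uses `math.isclose` because the transformed values go through floating-point division, so exact equality is not guaranteed.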

6. The correlation coefficient is the geometric mean of the two regression coefficients, b_{yx} (regression of Y on X) and b_{xy} (regression of X on Y),

r = \pm\sqrt{b_{xy} \times b_{yx}}

where r takes the sign shared by the two regression coefficients.

7. If the correlation coefficient is zero, it does not mean that the two variables are independent. It only shows that there is no **linear** dependence between the two variables. The two variables may still be dependent on each other via a non-linear relationship. For example, suppose we calculate the correlation coefficient for the following data values:

| X | Y |
|---|---|
| -1 | 1 |
| 0 | 0 |
| 1 | 1 |

We can calculate that r=0 for the given data, but the variables X and Y are dependent on each other via the relationship y = x^2.
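The calculation for these three points can be reproduced with a short sketch. The variable names are illustrative; the data are the values given above.

```python
import math

# Data from the example: every point satisfies y = x^2
xs = [-1, 0, 1]
ys = [1, 0, 1]

n = len(xs)
sum_x, sum_y = sum(xs), sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

# Numerator of the raw-sums formula: n*sum(xy) - sum(x)*sum(y)
numerator = n * sum_xy - sum_x * sum_y
denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
               * math.sqrt(n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(numerator)  # both sum(x) and sum(xy) are 0, so the numerator vanishes
print(r)
print(all(y == x ** 2 for x, y in zip(xs, ys)))  # exact functional dependence
```

The numerator is zero because the positive and negative x values cancel, even though Y is completely determined by X.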

8. We can calculate the coefficient of determination r^2, which tells us about the explanatory power of our regression model. Suppose we construct a simple linear regression model of the dependent variable Y on the independent variable X. If the value of r^2 is 0.87, then 87% of the variation in Y is explained by the independent variable X.
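The link between r^2 and explained variation can be sketched as follows for a simple least-squares line. The data and variable names are made up for illustration. The same sketch also checks property 6, that the product of the two regression coefficients equals r^2.

```python
import math

# Hypothetical sample data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Sums of squares and cross-products about the means
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
s_xx = sum((x - mean_x) ** 2 for x in xs)
s_yy = sum((y - mean_y) ** 2 for y in ys)

b_yx = s_xy / s_xx                  # regression coefficient of Y on X
b_xy = s_xy / s_yy                  # regression coefficient of X on Y
r = s_xy / math.sqrt(s_xx * s_yy)   # correlation coefficient

# Fit Y on X and measure the share of variation the line explains
intercept = mean_y - b_yx * mean_x
fitted = [intercept + b_yx * x for x in xs]
ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))  # unexplained variation
ss_tot = s_yy                                           # total variation in Y
r_squared = 1 - ss_res / ss_tot

print(round(r_squared, 4), round(r ** 2, 4))  # the two agree
print(round(b_xy * b_yx, 4))                  # property 6: also equals r^2
```

For this data roughly 60% of the variation in Y is explained by X. The equality between `1 - ss_res / ss_tot` and `r ** 2` holds for a least-squares line with an intercept and a single predictor.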