
Is Variance Affected by Outliers? Is It Resistant?


The variance is not a resistant measure of spread. In other words, the variance is strongly affected by the presence of outliers.

If even one data value is replaced by an extremely large value, the variance becomes extremely large as well.

Why is the Variance/Standard Deviation Affected by Outliers?

To understand why the variance is affected by outliers, recall that the variance measures how spread out the data values are around their mean.

If the data values are far apart from each other, the variance is large; if they are close to each other, the variance is small.

So if we replace one of the values with an extremely large value, that value lies far from the rest of the data. Because the variance is built from squared deviations from the mean, this single large distance increases the variance sharply.

Example:

Let us try to understand how the variance is affected by looking at an example. Consider the following set of data values: 1, 2, 3, 4, 5.

We first calculate the mean of the data values as follows,

Mean = (1 + 2 + 3 + 4 + 5)/5 = 15/5 = 3.

The variance can then be calculated using the formula,

Variance = ∑(x_i − Mean)^2 / n = ∑(x_i − 3)^2 / 5.

So the variance is equal to,

V = [(-2)^2 + (-1)^2 + 0^2 + 1^2 + 2^2]/5 = 10/5 = 2.
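The calculation above can be checked with a short Python sketch. The function name population_variance is our own, chosen only for illustration. It divides by n, matching the formula used in this article, rather than by n − 1 as the sample variance does. The standard library function statistics.pvariance uses the same definition and serves as a cross-check.

```python
from statistics import pvariance


def population_variance(values):
    """Mean of the squared deviations from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


data = [1, 2, 3, 4, 5]

# Hand-rolled version, mirroring the steps in the text.
print(population_variance(data))  # 2.0

# Cross-check against the standard library's population variance.
print(population_variance(data) == pvariance(data))  # True
```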

Now let us see what happens when one of the data values is replaced by an extremely large value.

Suppose that the data value 5 is replaced by the value 90. The new data set is 1, 2, 3, 4, 90. These data values clearly have a greater "spread", so we expect the variance to be much greater. We now calculate the new variance.

First, the new mean is calculated as follows,

New Mean = (1 + 2 + 3 + 4 + 90)/5 = 100/5 = 20.

The new variance can then be calculated using the formula,

Variance = ∑(x_i − New Mean)^2 / n = ∑(x_i − 20)^2 / 5.

So the new variance is equal to,

V = [(-19)^2 + (-18)^2 + (-17)^2 + (-16)^2 + 70^2]/5 = 6130/5 = 1226.

We see that the variance drastically changes from a value of 2 to a value of 1226.
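The comparison can be reproduced with the following sketch, which uses statistics.pvariance and statistics.pstdev from the Python standard library (both divide by n, as in the formulas above). The variable names are our own and only illustrative.

```python
from statistics import pstdev, pvariance

original = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 90]  # the value 5 replaced by 90

# Population variance of each data set.
var_original = pvariance(original)
var_outlier = pvariance(with_outlier)

# Population standard deviation (square root of the variance).
sd_original = pstdev(original)
sd_outlier = pstdev(with_outlier)

print("variance:", var_original, "->", var_outlier)
print("standard deviation:", round(sd_original, 2), "->", round(sd_outlier, 2))

# Factor by which one replaced value inflates the variance.
print("variance ratio:", var_outlier / var_original)
```

Changing a single value out of five multiplies the variance by 613 in this example, which is exactly the behavior a resistant measure would not show.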

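Because the standard deviation is the square root of the variance, it rises as well, from √2 ≈ 1.41 to √1226 ≈ 35.01. We conclude that the variance and the standard deviation are not resistant measures of spread: both are strongly affected by the presence of outliers.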
