Chauvenet's criterion is a statistical test used to identify outliers in a data set. The test assumes that the data are normally distributed. It provides a bound on how far a data point may deviate from the mean, measured in standard deviations; a point whose deviation exceeds that bound is classified as an outlier.
Procedure for conducting Chauvenet's criterion test:
Step 1: Identify the data point that you want to test for being an outlier.
Step 2: Calculate the mean and standard deviation of the data.
Step 3: Calculate the value of the Chauvenet test statistic (essentially the Z score) using the formula,
Test statistic = |data point - mean| / standard deviation
Step 4: Find the critical Chauvenet value for the given sample size n from the table below. The critical value is the Z score beyond which the two-tailed normal probability equals 1/(2n), rather than a fixed 5% level of significance.
Step 5: If the test statistic exceeds the critical value obtained from the table, we conclude that the data point is an outlier.
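The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `chauvenet_critical_value` and `is_chauvenet_outlier` are hypothetical, the standard deviation is taken as the population standard deviation (dividing by n), and instead of a lookup table the critical value is computed from the usual form of the criterion, in which the two-tailed normal probability beyond the critical value is 1/(2n).

```python
from statistics import NormalDist, fmean, pstdev


def chauvenet_critical_value(n):
    """Critical Z score for sample size n, assuming P(|Z| > z) = 1/(2n)."""
    # Each tail holds 1/(4n), so the critical value is the (1 - 1/(4n)) quantile.
    return NormalDist().inv_cdf(1 - 1 / (4 * n))


def is_chauvenet_outlier(data, point):
    """Return (test_statistic, critical_value, is_outlier) for one data point."""
    mean = fmean(data)
    sd = pstdev(data)  # assumption: population standard deviation (divide by n)
    statistic = abs(point - mean) / sd
    critical = chauvenet_critical_value(len(data))
    return statistic, critical, statistic > critical
```

Calling `is_chauvenet_outlier(data, point)` once per suspect point mirrors Steps 1 to 5; whether the sample or the population standard deviation is appropriate depends on the convention behind the table being used.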
Example: Identify the outlier values in the following set of data:
5, 14, 16, 19, 20, 22
Solution: Here n = 6. Let us check whether the data point 5 is an outlier.
Step 1: We calculate the mean and the standard deviation of the data and obtain,
Mean = 16
Standard deviation = 5.57 (population formula, dividing by n)
Step 2: The value of the Chauvenet test statistic for the data point 5 is,
Test statistic = |5 - 16| / 5.57 = 11 / 5.57 = 1.97
Step 3: From the above table the critical value for n=6 is,
Critical value = 1.732
Step 4: Since the test statistic (1.97) exceeds the critical value (1.732), we conclude that the data point ‘5’ is a spurious outlier. The other data points can be tested individually in the same way.
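As a quick check of the worked example, the sketch below applies the same calculation to every data point. It assumes the population standard deviation (which reproduces the 5.57 above) and computes the critical value from the 1/(2n) two-tailed rule rather than reading it from a table; the variable names are illustrative only.

```python
from statistics import NormalDist, fmean, pstdev

data = [5, 14, 16, 19, 20, 22]
n = len(data)
mean = fmean(data)
sd = pstdev(data)  # population standard deviation, matching the 5.57 in the example
critical = NormalDist().inv_cdf(1 - 1 / (4 * n))  # assumed 1/(2n) two-tailed rule

# Test statistic for every data point, and the points that exceed the critical value
statistics_by_point = {x: abs(x - mean) / sd for x in data}
outliers = [x for x, z in statistics_by_point.items() if z > critical]

for x, z in statistics_by_point.items():
    print(f"x = {x:>2}: statistic = {z:.2f}, outlier = {z > critical}")
```

Under these assumptions only the point 5 exceeds the critical value; every other point has a test statistic below 1.1.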
Logic behind Chauvenet's criterion:
The logic behind Chauvenet's test is that if the population is normally distributed, a data point has a high probability of lying within a certain number of standard deviations of the mean and a much smaller probability of lying beyond that bound. So if a data point does lie beyond the bound, it is reasonable to conclude that it is a spurious outlier.
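This reasoning can be written out as a short formula. In the usual statement of the criterion (given here as background, with Z a standard normal variable, \Phi its cumulative distribution function, and z_c the critical value as notation introduced for this sketch), the bound is chosen so that the two-tailed probability beyond it is 1/(2n):

```latex
P\bigl(|Z| > z_c\bigr) = \frac{1}{2n}
\quad\Longrightarrow\quad
z_c = \Phi^{-1}\!\left(1 - \frac{1}{4n}\right)
```

For n = 6 this gives \Phi(z_c) = 1 - 1/24 \approx 0.9583, so z_c \approx 1.73, which agrees with the critical value 1.732 used in the example.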
Other methods to identify outliers:
Some other methods that can be used to identify outliers are:
- The 1.5 IQR rule.
- Dixon's Q test.
- Grubbs' test.
- Visually identifying outliers by drawing scatterplots.
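For comparison, the first method in the list can be sketched on the same data. This is only an illustration of the 1.5 IQR rule, assuming the "inclusive" quartile convention of Python's `statistics.quantiles`; other quartile conventions can place the fences differently on a sample this small.

```python
from statistics import quantiles

data = [5, 14, 16, 19, 20, 22]

# Quartiles with the "inclusive" convention (an assumption, see the note above)
q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points outside the fences are flagged as outliers by the 1.5 IQR rule
iqr_outliers = [x for x in data if x < lower_fence or x > upper_fence]
print(lower_fence, upper_fence, iqr_outliers)
```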