Averages, or measures of central tendency, give us a sense of where the data concentrate around the center of the distribution. Despite their enormous value in statistical analysis, they have some drawbacks. If we are given only the average of a set of observations, we cannot fully understand the distribution, because several distributions may share the same average and yet differ markedly in other respects. Measures of central tendency alone therefore fall short of fully describing a distribution.
Measures of Dispersion – Definition
The complete story of the data is not revealed by an average. It is hardly a complete representation of the data unless we are aware of how the different data values disperse around it. If we are to determine how representative the average is, a more thorough explanation of the series is required.
As a result, the measures of central tendency must be supplemented by further measures, known as the measures of dispersion. We examine dispersion to determine whether the distribution is homogeneous (compact) or heterogeneous (scattered).
The degree of scatter or variation of the observations around a central value is known as dispersion or spread. A measure of dispersion indicates how widely the numerical data are spread around an average value.
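To make the definition concrete, the following minimal sketch compares two small data sets that share the same mean but differ in spread. The data values and the names `compact` and `scattered` are invented for illustration, and the range and the population standard deviation are used here only as two familiar examples of dispersion measures.

```python
from statistics import mean, pstdev

# Hypothetical data sets, made up for illustration: both have a mean of 50.
compact = [48, 49, 50, 51, 52]
scattered = [10, 30, 50, 70, 90]

def summarize(data):
    """Return the mean, range, and population standard deviation of data."""
    data_range = max(data) - min(data)
    return mean(data), data_range, pstdev(data)

for name, data in (("compact", compact), ("scattered", scattered)):
    avg, data_range, sd = summarize(data)
    print(f"{name}: mean={avg}, range={data_range}, sd={sd:.2f}")
```

Both series report the same average, so only the range and the standard deviation reveal that the second series is far more scattered than the first.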
Objectives and Significance of Measures of Dispersion
1. To determine whether an average is reliable.
The measures of variation let us determine whether the average is representative of the data. Dispersion, as previously mentioned, describes how the data are distributed around an average value. If the dispersion is small, the data values lie close to the average (central value), and the average can be regarded as reliable in the sense that it gives a reasonably accurate estimate of the average of the corresponding population. If the dispersion is large, the data values lie further from the central value, so the average is less representative and therefore not very reliable.
2. To control the variability.
Measures of variation help us identify the nature and causes of variation so that the variation itself can be controlled. In industry, they help determine how far the output of various operations deviates from the quality standard. For instance, we use 3-sigma control limits to assess whether a manufacturing process is under control. This enables us to pinpoint the root causes of variation in the manufactured product and take appropriate corrective action. As another example, after carefully examining the distribution of income and wealth, the government can adopt the necessary policy measures to eliminate the disparities in that distribution.
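The 3-sigma idea mentioned above can be sketched as follows. This is a simplified illustration, not a full control-chart procedure: the measurements are invented, the function names `three_sigma_limits` and `out_of_control` are hypothetical, and sigma is estimated with the plain sample standard deviation of a baseline sample, whereas practical control charts often estimate it in other ways.

```python
from statistics import mean, stdev

def three_sigma_limits(baseline):
    """Return (lower, center, upper) limits at mean -/+ 3 standard deviations."""
    center = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation (an assumed choice)
    return center - 3 * sigma, center, center + 3 * sigma

def out_of_control(values, lower, upper):
    """Return (index, value) pairs that fall outside the control limits."""
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]

# Hypothetical baseline measurements from a process assumed to be stable.
baseline = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.1, 9.9]
lower, center, upper = three_sigma_limits(baseline)

# Hypothetical new measurements to check against the limits.
new_measurements = [10.0, 10.3, 9.2, 10.1]
flagged = out_of_control(new_measurements, lower, upper)

print(f"limits: {lower:.3f} to {upper:.3f} (center {center:.3f})")
print("flagged:", flagged)
```

A value that falls outside the limits is a signal to look for an assignable cause, which is the corrective step described in the text.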
3. To compare the variability of two or more sets of data.
Even if they are measured in different units, two or more distributions can be compared using the relative measures of dispersion to determine how variable or uniform they are.
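One commonly used relative measure is the coefficient of variation, the standard deviation expressed as a percentage of the mean. The sketch below uses it to compare two invented series measured in different units. The data and the function name `coefficient_of_variation` are assumptions made for illustration, and the population standard deviation is one of several possible choices.

```python
from statistics import mean, pstdev

def coefficient_of_variation(data):
    """Return the standard deviation as a percentage of the mean (unit-free)."""
    avg = mean(data)
    if avg == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(data) / abs(avg) * 100

# Hypothetical measurements in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [50, 60, 70, 80, 90]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print(f"heights: CV = {cv_heights:.1f}%")
print(f"weights: CV = {cv_weights:.1f}%")
```

Because the units cancel, the two percentages can be compared directly. In this made-up example the weights are relatively more variable than the heights.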
4. To serve as a basis for other statistical measures used in data analysis.
Many statistical measures that are frequently used in Correlation Analysis, Regression Analysis, Theory of Estimation and Testing of Hypotheses, Statistical Quality Control, and other applications are computed using the measures of variation.
Characteristics of Ideal Measures of Dispersion
- It ought to be clearly defined.
- It must be simple to calculate and comprehend.
- It ought to be based on all of the observations.
- It ought to be amenable to further mathematical treatment.
- It should be affected as little as possible by sampling fluctuations.
- It shouldn’t be significantly affected by extreme observations.
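The last characteristic can be illustrated with a short sketch that adds one extreme value to an invented data set and compares how two measures respond. The range and the interquartile range are chosen here only as examples, the data are hypothetical, and the quartiles come from `statistics.quantiles` with its default method, so other quartile conventions may give slightly different numbers.

```python
from statistics import quantiles

def value_range(data):
    """Return the difference between the largest and smallest values."""
    return max(data) - min(data)

def interquartile_range(data):
    """Return Q3 - Q1 using the default method of statistics.quantiles."""
    q1, _, q3 = quantiles(data, n=4)
    return q3 - q1

# Hypothetical observations, and the same observations plus one extreme value.
data = [12, 14, 15, 15, 16, 17, 18]
with_extreme = data + [95]

range_change = value_range(with_extreme) - value_range(data)
iqr_change = interquartile_range(with_extreme) - interquartile_range(data)

print("change in range:", range_change)
print("change in interquartile range:", iqr_change)
```

In this example the single extreme observation changes the range far more than it changes the interquartile range, which is the kind of sensitivity an ideal measure should avoid.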