The five-number summary of a data set is a set of five representative values computed from the data that gives us an idea of what the data looks like and how it is distributed.
The main advantage of the five-number summary is that, instead of tediously going through the entire data set, we can use it to get an overview of the data at a glance.
The 5-number summary consists of the following 5 pieces of information:
- The smallest value.
- The lower quartile (the value beneath which 25% of data values lie).
- The median (the value beneath which 50% of the data values lie).
- The upper quartile (the value beneath which 75% of data values lie).
- The highest value.
The five-number summary may include outliers, because the lowest and highest values it reports may themselves turn out to be outliers.
How to calculate the five-number summary?
To calculate the five-number summary, arrange the data in ascending order (lowest to highest), read off the lowest and highest values, and then find the median and the lower and upper quartiles.
A five-number summary is sometimes visualized using a box plot with “whiskers” attached to it.
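The procedure described above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name `five_number_summary` and the sample values are hypothetical, and the quartiles come from the standard library's `statistics.quantiles` with `method="inclusive"`, which interpolates between neighbouring values. That is only one of several common quartile conventions.

```python
import statistics

def five_number_summary(values):
    """Return (lowest, lower quartile, median, upper quartile, highest)."""
    data = sorted(values)  # arrange the data in ascending order
    # n=4 asks for the three cut points that split the data into quarters.
    # method="inclusive" is an assumption; other conventions may differ.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, q2, q3, data[-1]

# Hypothetical sample, not taken from the text
sample = [12, 5, 8, 20, 2, 15, 9]
print(five_number_summary(sample))  # (2, 6.5, 9.0, 13.5, 20)
```

Because quartile conventions differ, another tool may report slightly different lower and upper quartiles for the same data.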
Example:
Calculate the five-number summary for the given set of data:
1, 3, 4, 17, 19, 21, 45, 77, 79.
Solution:
As the data is already in ascending order, we can see that the lowest and highest values are 1 and 79 respectively.
We have N=9 observations in the given data.
Median = ((N+1)/2)th term = 5th term = 19
Lower quartile = (N/4)th term = 2.25th term, rounded to the nearest position, that is the 2nd term = 3
Upper quartile = (3N/4)th term = 6.75th term, rounded to the nearest position, that is the 7th term = 45
So our five-number summary for the data is
1 – 3 – 19 – 45 – 79.
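The worked example can be checked with a short Python sketch. It assumes the positional rule used in the solution: the quartile positions N/4 and 3N/4 are rounded to the nearest whole position. The helper name `term_at` is hypothetical. Other quartile conventions, such as interpolating between neighbouring terms, can give different quartile values for the same data.

```python
import math
import statistics

data = [1, 3, 4, 17, 19, 21, 45, 77, 79]  # already in ascending order
n = len(data)

def term_at(position):
    """Return the term at a 1-based position, rounded to the nearest whole position."""
    index = math.floor(position + 0.5)  # round half up (assumed rounding rule)
    return data[index - 1]              # convert 1-based position to 0-based index

summary = (
    data[0],                  # lowest value
    term_at(n / 4),           # lower quartile: 2.25th term -> 2nd term
    statistics.median(data),  # median: middle (5th) term
    term_at(3 * n / 4),       # upper quartile: 6.75th term -> 7th term
    data[-1],                 # highest value
)
print(summary)  # (1, 3, 19, 45, 79)
```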
Seven Number Summary:
Sometimes, if a person wants more information about the data, they use the “seven-number summary”.
The 7-number summary consists of the 2nd, 9th, 25th, 50th, 75th, 91st, and 98th percentiles and gives us a more detailed picture of the distribution, particularly its tails, than the five-number summary.
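A minimal Python sketch of these seven percentiles follows. The function name `seven_number_summary` and the sample data are hypothetical, and the percentiles come from `statistics.quantiles` with `n=100` and `method="inclusive"`, which is an assumed convention rather than the only possible one.

```python
import statistics

def seven_number_summary(values):
    """Return the 2nd, 9th, 25th, 50th, 75th, 91st and 98th percentiles."""
    # n=100 yields 99 cut points; the k-th cut point (1-based) is the k-th percentile.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return [cuts[p - 1] for p in (2, 9, 25, 50, 75, 91, 98)]

# Hypothetical data: the integers 0 to 100, so each percentile equals its rank
print(seven_number_summary(range(101)))
# [2.0, 9.0, 25.0, 50.0, 75.0, 91.0, 98.0]
```

With small data sets the extreme percentiles depend heavily on interpolation, so this summary is more informative when there are many observations.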