The five number summary of a data set is a set of five representative values, computed from the data, that give us an idea of what the data looks like and how it is distributed.
The main advantage of the five number summary is that, instead of tediously going through the entire data set, we can use it to get an overview of the data at a glance.
The 5 number summary consists of the following 5 pieces of information:
- The smallest value.
- The lower quartile (the value beneath which 25% of data values lie).
- The median (the value beneath which 50% of the data values lie).
- The upper quartile (the value beneath which 75% of data values lie).
- The highest value.
Outliers in the data may end up in the five number summary. This is because the summary contains the highest and lowest values, either of which may turn out to be an outlier.
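The five values listed above can be computed with a few lines of Python. The following is a minimal sketch, not a definitive implementation: the function name `five_number_summary` and the sample data are made up for illustration, and the quartiles come from the standard library's `statistics.quantiles` with `method="inclusive"`, which is only one of several quartile conventions and can differ slightly from quartiles found by hand.

```python
import statistics


def five_number_summary(values):
    """Return (lowest, lower quartile, median, upper quartile, highest)."""
    data = sorted(values)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    # method="inclusive" is an assumption made for this sketch; the default
    # "exclusive" method can give slightly different quartiles.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, q2, q3, data[-1]


# Hypothetical data, chosen only for illustration.
sample = [2, 4, 4, 5, 7, 9, 10, 12, 15]
print(five_number_summary(sample))  # (2, 4.0, 7.0, 10.0, 15)
```

Because different software packages use different quartile conventions, it is worth checking which one is in use before comparing summaries from different sources.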
How to calculate the five number summary?
The procedure to calculate the five number summary is simply to arrange our data from lowest to highest and then find the median and the lower and upper quartiles. A five number summary is sometimes visualized using a box plot with “whiskers” attached to it.
Example: Calculate the five number summary for the given set of data:
Solution: As the data is already in ascending order we clearly see that the highest and lowest values are 84 and 1 respectively. We have N=9 observations in the given data.
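Median = (N/2)th term = 4.5th term, rounded up to the 5th term = 19. (For an odd number of observations this is the middle term, the ((N+1)/2)th.)
Lower quartile = (N/4)th term = 2.25th term, rounded to the 2nd term = 3.
Upper quartile = (3N/4)th term = 6.75th term, rounded to the 7th term = 45.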
So our five number summary for the data is 1 – 3 – 19 – 45 – 84
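The positional rule used in the worked example can be sketched in Python as follows. This is an illustration under stated assumptions: the helper names are hypothetical, the sample data is invented and is not the data set from the example, and fractional positions are rounded to the nearest whole position with halves going up (so 4.5 becomes 5), which is how the example appears to round. For an even number of observations this rule picks a single term rather than averaging the two middle terms, so it may not match the usual textbook median.

```python
import math


def nearest_position(position):
    """Round a 1-based position to the nearest whole number, halves going up."""
    return math.floor(position + 0.5)


def five_number_summary_by_position(values):
    """Five number summary using the (N/4)th, (N/2)th and (3N/4)th terms."""
    data = sorted(values)  # arrange from lowest to highest
    n = len(data)

    def term(position):
        # Clamp to 1..n so very small data sets never index outside the list.
        index = min(max(nearest_position(position), 1), n)
        return data[index - 1]  # convert the 1-based position to a 0-based index

    return data[0], term(n / 4), term(n / 2), term(3 * n / 4), data[-1]


# Hypothetical data with N = 9 observations, chosen only for illustration.
sample = [3, 6, 7, 11, 13, 18, 21, 26, 30]
print(five_number_summary_by_position(sample))  # (3, 6, 13, 21, 30)
```

With N = 9 the positions are 2.25, 4.5 and 6.75, which round to the 2nd, 5th and 7th terms, the same positions used in the solution above.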
Seven Number Summary:
Sometimes, when more information about the data is wanted, the “seven number summary” is used instead. The 7 number summary consists of the 2nd, 9th, 25th, 50th, 75th, 91st, and 98th percentiles and gives us a more detailed picture of the distribution than the five number summary.
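The seven percentiles named above can be read off in the same way. The sketch below is only one possible approach: the function name `seven_number_summary` and the sample data are hypothetical, and the percentiles are taken from `statistics.quantiles` with `n=100` and `method="inclusive"`, an assumed convention that interpolates between neighbouring data values.

```python
import statistics

# The percentiles that make up the seven number summary described above.
SEVEN_NUMBER_PERCENTILES = (2, 9, 25, 50, 75, 91, 98)


def seven_number_summary(values):
    """Return a dict mapping each of the seven percentiles to its value."""
    data = sorted(values)
    # n=100 yields 99 cut points; the cut point at index p - 1 is the
    # p-th percentile. method="inclusive" is an assumption of this sketch.
    cuts = statistics.quantiles(data, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in SEVEN_NUMBER_PERCENTILES}


# Hypothetical data: the whole numbers 0 to 100, so each percentile
# equals its own rank under the inclusive method.
sample = list(range(101))
print(seven_number_summary(sample))
# {2: 2.0, 9: 9.0, 25: 25.0, 50: 50.0, 75: 75.0, 91: 91.0, 98: 98.0}
```

For small data sets the extreme percentiles (2nd and 98th) are mostly interpolation, so the seven number summary is more informative when there are many observations.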