Box Plots
Box plots are a common way to present data (Figure 9). At first glance
these plots can be intimidating, but once you know what each part of the
box plot represents you can tell a lot about the distribution of the
data. These plots consist of three components: 1) the box, which
encompasses the middle 50% of the data, 2) the horizontal line within the
box, which represents the median value, and 3) the vertical lines
extending above and below the box, which indicate the maximum and minimum
values respectively. The location of these components relative to each
other illustrates how the data are distributed. We will now look at the
four examples presented in Figure 9, describe the average, median,
minimum, and maximum values of each, and investigate how the data
influence the box plot.
Figure 9. Examples of Box Plots
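
The three components described above can be computed directly from a list
of values. The following is a minimal sketch, not the method used to draw
Figure 9. The function name five_number_summary is made up for
illustration, and it assumes the edges of the box are the medians of the
lower and upper halves of the sorted data; the text does not say which
quartile convention the figure uses, and other conventions give slightly
different boxes.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, box bottom, median, box top, maximum).

    Assumption: the box edges are the medians of the lower and upper
    halves of the sorted data. Needs at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]     # lower half (middle value excluded if odd)
    upper = data[-half:]    # upper half (middle value excluded if odd)
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Example 1 from Figure 9
print(five_number_summary([2, 3, 4, 5, 6, 7, 8, 9]))
# prints (2, 3.5, 5.5, 7.5, 9)
```

The first and last numbers are the ends of the vertical lines, the second
and fourth are the bottom and top of the box, and the middle number is the
median line.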
Example 1 is what is known as an even distribution (Figure 9). The
individual values are 2, 3, 4, 5, 6, 7, 8, and 9. The average of these
values is 5.5, which also happens to be the median value (remember that
when dealing with an even number of values, the median equals the average
of the two middle values, in this case 5 and 6). The minimum and maximum
values are 2 and 9 respectively, giving us a range of 7. This is
considered an even distribution because the mean and median are located
in the middle of the range. In other words, the minimum and maximum values
are equal distances from the median value. The box plot for these data has
the median line in the middle of the box, and the minimum and maximum
lines extend equal distances from the box.
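
The arithmetic for Example 1 can be checked with a short sketch using
Python's standard statistics module. The variable names are chosen here
for illustration only.

```python
from statistics import mean, median

example_1 = [2, 3, 4, 5, 6, 7, 8, 9]

avg = mean(example_1)               # 44 / 8 = 5.5
mid = median(example_1)             # average of 5 and 6 = 5.5
low, high = min(example_1), max(example_1)
data_range = high - low             # 9 - 2 = 7

# In an even distribution the extremes sit equally far from the median.
distance_below = mid - low          # 5.5 - 2 = 3.5
distance_above = high - mid         # 9 - 5.5 = 3.5

print(avg, mid, data_range, distance_below == distance_above)
# prints 5.5 5.5 7 True
```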

Example 2 contains the following values: 2, 3, 4, 5, 6, 7, 10, and 13
(Figure 9). The average equals 6.25 and the median is 5.5. The minimum and
maximum values are 2 and 13, giving us a range of 11. Having an average
that is larger than the median suggests that the data are not balanced
around the median. Review of the data shows that the two high values (10
and 13) deviate from the median more than the two low values (2 and 3).
Data like these are referred to as skewed. When we compare the box plot
for Example 2 to the box plot for Example 1, we see two indications that
the data are skewed: 1) the vertical line identifying the maximum is
longer than the minimum line, and 2) the box extends higher above the
median line than it does below.
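
Both indications of skew in Example 2 can be put into numbers. This is a
rough sketch that assumes the box edges are the medians of the lower and
upper halves of the data; the quartile convention behind Figure 9 is not
stated, so the exact lengths may differ from the figure.

```python
from statistics import mean, median

example_2 = sorted([2, 3, 4, 5, 6, 7, 10, 13])
half = len(example_2) // 2

avg = mean(example_2)                # 50 / 8 = 6.25
mid = median(example_2)              # 5.5
q1 = median(example_2[:half])        # bottom of box (assumed convention)
q3 = median(example_2[-half:])       # top of box (assumed convention)

# Indication 1: the maximum line is longer than the minimum line.
upper_line = example_2[-1] - q3      # 13 - 8.5 = 4.5
lower_line = q1 - example_2[0]       # 3.5 - 2 = 1.5

# Indication 2: the box extends farther above the median than below it.
box_above = q3 - mid                 # 8.5 - 5.5 = 3.0
box_below = mid - q1                 # 5.5 - 3.5 = 2.0

print(avg > mid, upper_line > lower_line, box_above > box_below)
# prints True True True
```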



In Example 3 the values are 2, 3, 4, 4.25, 4.75, 7, 8, and 9 (Figure 9).
The average is 5.25 and the median is 4.5. The range is 7, with minimum
and maximum values of 2 and 9 respectively. We again have skewed data, as
the median and average are not equal. This time, instead of extreme high
values, the skewing is caused by the values of 4, 4.25, and 4.75 being
clumped close together. Comparison of this box plot to Example 1 shows
that the plots are the same with the exception of the median line. In
Example 3 we are tipped off to the skewness in the data by the fact that
the area of the box above the median line is larger than the area below.
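
The comparison between Examples 1 and 3 can be sketched as follows. The
helper name box_stats is hypothetical, and it assumes the box edges are
the medians of the lower and upper halves of the data, which may not be
the convention used to draw Figure 9.

```python
from statistics import median

def box_stats(values):
    """Return (box bottom, median, box top) under the assumed convention."""
    data = sorted(values)
    half = len(data) // 2
    return (median(data[:half]), median(data), median(data[-half:]))

example_1 = [2, 3, 4, 5, 6, 7, 8, 9]
example_3 = [2, 3, 4, 4.25, 4.75, 7, 8, 9]

q1_a, mid_a, q3_a = box_stats(example_1)   # 3.5, 5.5, 7.5
q1_b, mid_b, q3_b = box_stats(example_3)   # 3.5, 4.5, 7.5

# Same box, different median line.
print((q1_a, q3_a) == (q1_b, q3_b), mid_a, mid_b)
# prints True 5.5 4.5

# In Example 3 more of the box lies above the median line than below it.
print(q3_b - mid_b, mid_b - q1_b)
# prints 3.0 1.0
```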

Example 4 consists of the following values: 2, 3, 3.25, 3.50, 3.75, 4, 5,
and 9 (Figure 9). The average for these data equals 4.19 and the median is
3.625 (the average of the two middle values, 3.50 and 3.75). The minimum,
maximum, and range are the same as seen in Examples 1 and 3. Again we have
skewed data caused by a clumping of low values (similar to Example 3).
This time the number of values clumped together is greater, causing the
box to be smaller. Clues that the data are skewed are that the median line
is not in the center of the box and the maximum line extends farther away
from the box than the minimum line.
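
The figures for Example 4 can be worked through in the same way. As
before, this is only a sketch: the helper name quartiles is made up, and
it assumes the box edges are the medians of the lower and upper halves of
the data, which the text does not confirm for Figure 9.

```python
from statistics import mean, median

def quartiles(values):
    """Return (box bottom, box top) under the assumed convention."""
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data[-half:])

example_1 = [2, 3, 4, 5, 6, 7, 8, 9]
example_4 = [2, 3, 3.25, 3.50, 3.75, 4, 5, 9]

avg = mean(example_4)            # 33.5 / 8 = 4.1875, about 4.19
mid = median(example_4)          # average of 3.50 and 3.75 = 3.625

q1, q3 = quartiles(example_4)    # 3.125 and 4.5
box_height_4 = q3 - q1           # 1.375
q1_a, q3_a = quartiles(example_1)
box_height_1 = q3_a - q1_a       # 4.0, so the Example 4 box is smaller

upper_line = max(example_4) - q3   # 9 - 4.5 = 4.5
lower_line = q1 - min(example_4)   # 3.125 - 2 = 1.125

print(avg, mid, box_height_4, box_height_1, upper_line, lower_line)
# prints 4.1875 3.625 1.375 4.0 4.5 1.125
```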

For more about Box Plots, try this page:
http://www.lmvp.org/introduction/understanding.htm
