Finding The Median Of A Data Set

Finding the median, or middle value, of a set of numbers provides a way to summarize the data with a single typical value, known as a summary statistic.

What is a median value?

The median value is the middle value in an ordered set of values. Half of the values are at or below the median and half are at or above it. When the data set contains an odd number of values, the middle value is the median value. When the data set contains an even number of values, the two middle values are added and the sum is divided by two. That number is the median value.

How do I calculate a median value?

To find the median, we first list the numbers from smallest to largest. For example, the data set {4, 8, 2, 10, 6} is reordered as {2, 4, 6, 8, 10}. Since there is an odd number of values, we take the middle one. The third value, 6, falls in the middle of the ordered list. Two numbers fall before it and two fall after it.

In the data set {2, 3, 4, 5}, which is already in order and has an even number of values, the median value falls between the 2nd and 3rd values. We add these two values (3 + 4) and divide by 2. The median value of this data set is 3.5.



In both cases, the data set is divided so that half of the observations lie below the median and half lie above it.
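
The two cases described above can be sketched in a few lines of Python. This is only a minimal illustration, not a definitive implementation: the function name `median` and the choice to raise an error for an empty data set are assumptions made for the example.

```python
def median(values):
    """Return the median of a non-empty collection of numbers."""
    ordered = sorted(values)  # list the numbers from smallest to largest
    n = len(ordered)
    if n == 0:
        # Assumption for this sketch: an empty data set has no median.
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: take the single middle value.
        return ordered[mid]
    # Even number of values: add the two middle values and divide by two.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([4, 8, 2, 10, 6]))  # 6
print(median([2, 3, 4, 5]))      # 3.5
```

The two calls reproduce the worked examples: the first data set has five values, so the third ordered value is returned, while the second has four values, so the 2nd and 3rd are averaged.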

Why is it important to know the median value?

The median value is a measure of central tendency. It is a summary statistic that provides us with a description of the entire data set and is especially useful with large data sets where we might not have the time to examine every single value. The median marks the center of the distribution by position: half of the values lie on each side of it, regardless of how large or small the most extreme values are. This is why typical housing costs are usually reported as the median cost rather than the mean, since a few extremely high values can skew the distribution to the right and pull the mean upward.

What do I need to know about the median?

The median is a very commonly reported summary statistic. It remains a meaningful measure of the center whether the data set is normally distributed or contains extreme values that skew the data. In a normal distribution, the mean, median, and mode are equal, and in data that is approximately normal they are approximately equal to each other. In a skewed distribution, there are extreme values (outliers) on either the left or right side of the distribution. The median is less sensitive to extreme values than the mean. Therefore, it is a very good summary statistic to use when the data contains outliers.
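
To make the point about outliers concrete, the short sketch below compares the mean and the median before and after one extreme value is added. The prices are made-up illustrative figures (in thousands), not real data, and the variable names are assumptions for the example. It uses Python's standard `statistics` module.

```python
import statistics

# Hypothetical house prices in thousands; invented for illustration only.
prices = [200, 210, 220, 230, 240]
prices_with_outlier = prices + [2000]  # add one extremely high value

mean_before = statistics.mean(prices)
median_before = statistics.median(prices)
mean_after = statistics.mean(prices_with_outlier)
median_after = statistics.median(prices_with_outlier)

print("without outlier: mean =", mean_before, "median =", median_before)
print("with outlier:    mean =", round(mean_after, 1), "median =", median_after)
```

In this example the single high value moves the mean from 220 to more than 500, while the median only shifts from 220 to 225.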

The median is useful with ordinal, interval, and ratio data. In a skewed distribution it often falls between the mode and the mean, although this is a rule of thumb rather than a guarantee.

© High Speed Ventures 2011