Data analysis often means choosing among many different tools. You may have a large dataset filled with information, but different analysis methods can give you different results. One of the most common errors in survey research comes from exactly this issue, and interestingly it involves one of the most basic choices in analysis: Mean vs. Median.
Mean vs. Median
Mean is simply another term for “Average.” It adds all of the numbers in the dataset together and divides the sum by the number of entries. Median, on the other hand, is the 50% point in the data: the middle value once the numbers are sorted, no matter how large or small the values on either side of it are (with an even number of entries, it is the average of the two middle values). For example, if you have the following data:
1, 1, 1, 1, 1, 1, 2, 2, 4
The median is just “1” since that is the middle (fifth) number of the nine sorted values, while the mean (average) is 14 ÷ 9 ≈ 1.56. For a lot of analysis, the mean is very useful. Indeed, if your data follow a roughly normal distribution, the mean can tell you a lot, because averaging smooths out some of the random noise in individual scores and gives you a single overall score for the group.
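The two figures above are easy to check by hand, but a short script makes the comparison repeatable. This is a minimal sketch using Python's standard `statistics` module; the variable names are just illustrative.

```python
from statistics import mean, median

# The nine values from the example above.
data = [1, 1, 1, 1, 1, 1, 2, 2, 4]

data_mean = mean(data)      # sum of the values divided by their count: 14 / 9
data_median = median(data)  # middle value of the sorted list (the fifth of nine)

print(f"mean   = {data_mean:.2f}")
print(f"median = {data_median}")
```

The single 4 and the two 2's are enough to lift the mean above every one of the six 1's, while the median stays at 1.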
But the mean is overused far too often. When you collect data, it’s not uncommon to find a few extreme scores, and those can pull the mean away from what is typical and alter the final results of your analysis.
Example of When Median is More Useful
Let’s say you run a customer satisfaction survey with a sample of 9 customers and ask them to rate their overall satisfaction on a scale of 1 to 10. You get an average of 5.22. You know that in general, you tend to retain customers with a score over 3, so you’re satisfied, because this indicates that you’re still above where you want to be. But then, suddenly, you lose 6 of those 9 customers. You go back to look at your data, and you find these scores:
1, 3, 3, 3, 3, 5, 9, 10, 10
The median of this group is a 3, which means at least half of your customers gave a score of 3 or lower. In fact, five of the nine did, and those are the scores at which you tend to lose customers. The mean was pulled upward by the 9 and the two 10’s, and you missed an important part of your data: the midpoint, which showed that at least half of your customers were dissatisfied with your company.
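A quick check like the following would have flagged the problem before the customers left. It is only a sketch: the scores are the ones listed above, and `retention_threshold` is an illustrative name for the "score over 3" rule of thumb from the example.

```python
from statistics import mean, median

# Satisfaction scores from the survey example above.
scores = [1, 3, 3, 3, 3, 5, 9, 10, 10]

# Assumption taken from the example: customers scoring above 3 tend to stay.
retention_threshold = 3

avg = mean(scores)    # 47 / 9
mid = median(scores)  # fifth of the nine sorted scores

# Count the customers at or below the threshold, i.e. the ones at risk.
at_risk = sum(1 for s in scores if s <= retention_threshold)

print(f"mean = {avg:.2f}, median = {mid}")
print(f"{at_risk} of {len(scores)} customers scored {retention_threshold} or lower")
```

Reporting the median and the at-risk count next to the mean shows at a glance that a comfortable-looking average is hiding a dissatisfied majority.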
Median can play a major role in things like income level research as well, because a few millionaires can pull the mean up and make it look like the socio-economic status of your sample is higher than it really is.
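To see how strong this effect can be, here is a small sketch with made-up numbers. The income list is purely hypothetical (nine people earning 30,000 a year and one earning 1,000,000) and is not drawn from any real survey.

```python
from statistics import mean, median

# Hypothetical annual incomes: nine people at 30,000 and one millionaire.
incomes = [30_000] * 9 + [1_000_000]

mean_income = mean(incomes)      # 1,270,000 / 10
median_income = median(incomes)  # average of the two middle values, both 30,000

print(f"mean income   = {mean_income:,.0f}")
print(f"median income = {median_income:,.0f}")
```

In this made-up sample the mean is more than four times what nine of the ten people actually earn, while the median matches the typical income exactly.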
Whenever your data follow a roughly normal distribution, using the mean is a good choice. But if your data has extreme scores (such as the difference between a millionaire and someone making 30,000 a year), you will need to look at the median, because it gives you a much more representative number for your sample.