Survey Insights

Single vs Multiple Imputation


Introduction

Imputation is one of the key strategies researchers use to fill in missing data in a dataset. Using various calculations to find the most probable answer, imputed data stands in for the actual data that is missing, allowing for more accurate analyses. There are two types of imputation:

  • Single imputation
  • Multiple imputation

Single imputation involves less computation and gives the dataset a specific value in place of each missing one. While there is more than one type of single imputation, in general the process involves analyzing the other responses to find the most likely response (or a set of the most likely responses) the individual would have given, then picking one of those responses at random and placing it in the dataset.

When only a little data is missing, single imputation is a useful enough tool. It fills in the data points well, and the variance in the results of your analyses is unlikely to change by any significant margin. But when you are dealing with a considerable amount of missing data, single imputation causes a serious problem: once a value has been added to the dataset, analyses treat it as equal to the data that was not imputed, which can make the results misleading.
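
As a rough illustration of the pick-at-random step described above, here is a minimal Python sketch of one kind of single imputation (often called hot-deck imputation). The function name `single_impute`, the field names, and the sample rows are hypothetical and exist only for this example. Real survey tools may choose the "most likely" responses in more sophisticated ways.

```python
import random

def single_impute(rows, target, match_on, rng):
    """Fill missing `target` values (None) by copying the answer of a
    randomly chosen respondent who matches on the `match_on` field."""
    completed = []
    for row in rows:
        if row[target] is not None:
            completed.append(dict(row))  # observed value, keep as-is
            continue
        # Candidate answers: observed values from similar respondents.
        donors = [r[target] for r in rows
                  if r[target] is not None and r[match_on] == row[match_on]]
        if not donors:
            # Assumption: fall back to all observed values if nobody matches.
            donors = [r[target] for r in rows if r[target] is not None]
        completed.append({**row, target: rng.choice(donors)})
    return completed

# Hypothetical survey rows: a 1-5 satisfaction score by age group.
survey = [
    {"age_group": "18-29", "score": 4},
    {"age_group": "18-29", "score": 5},
    {"age_group": "18-29", "score": None},
    {"age_group": "30-44", "score": 2},
    {"age_group": "30-44", "score": None},
]

filled = single_impute(survey, "score", "age_group", random.Random(0))
```

Note that once `filled` is produced, nothing in it marks which scores were imputed. That is the problem the paragraph above describes: later analyses cannot tell the guesses from the observed answers.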

How Does Multiple Imputation Fix This?

Multiple imputation seeks to solve that problem. It uses simulation models that draw from a set of possible responses and imputes repeatedly, producing several completed datasets. The differences between those datasets, which depend on the values the simulation chooses for the missing data, give you a variance or confidence interval that reflects how uncertain the imputed values are.

Recall that different types of computations are used to determine what data most likely belongs in the missing responses. In studies where the results are more uniform (where people similar to the person with the missing data tended to answer consistently), the imputed datasets should have much less variance. If, however, those with similar attributes gave varying responses themselves, then the imputed datasets will likely vary as well.
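
To make the point about uniform versus varied responses concrete, the sketch below repeats a simple random-draw imputation several times and measures how much the resulting averages differ. It is a simplified illustration, not a full multiple imputation procedure: the helper names `impute_once` and `imputed_means` and the sample answers are made up for this example, and proper methods draw from a statistical model rather than directly from the observed answers.

```python
import random
import statistics

def impute_once(values, rng):
    """Return a copy of `values` with each None replaced by a random
    draw from the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    return [v if v is not None else rng.choice(observed) for v in values]

def imputed_means(values, m, seed=0):
    """Build `m` imputed datasets and return the mean of each one."""
    rng = random.Random(seed)
    return [statistics.mean(impute_once(values, rng)) for _ in range(m)]

# Hypothetical answers from two groups of similar respondents.
uniform = [3, 3, 3, 3, None, None]   # everyone observed answered alike
varied = [1, 5, 2, 4, None, None]    # observed answers differ widely

# Spread of the means across 20 imputed datasets for each group.
spread_uniform = statistics.pstdev(imputed_means(uniform, m=20))
spread_varied = statistics.pstdev(imputed_means(varied, m=20))
```

With the uniform group every imputed dataset is identical, so `spread_uniform` is zero. With the varied group the imputed values change from run to run, so the means differ and `spread_varied` is larger.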

Choosing Single or Multiple Imputation

The greatest drawback of multiple imputation is its complexity. You need to know not only how to run the analysis on each imputed dataset, but also how to combine the results correctly. Similarly, if very little data is missing, single imputation may be simpler and solve the problem without serious errors. Otherwise, multiple imputation introduces the variability of the imputed data in order to find a range of possible responses from which to work.
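
The text above does not spell out how the results should be combined. One widely used approach is known as Rubin's rules: average the estimates from each imputed dataset, then add the average within-dataset variance to the between-dataset variance (inflated slightly for the finite number of imputations). The sketch below assumes that approach. The function name `pool_rubin` and the numbers are hypothetical.

```python
import statistics

def pool_rubin(estimates, variances):
    """Combine per-dataset estimates and their variances using Rubin's rules.

    estimates: one point estimate (e.g. a mean) per imputed dataset
    variances: the squared standard error of each estimate
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    pooled = statistics.mean(estimates)        # average of the estimates
    within = statistics.mean(variances)        # average sampling variance
    between = statistics.variance(estimates)   # variance across datasets (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total

# Hypothetical results from three imputed datasets.
estimates = [3.0, 3.2, 2.8]
variances = [0.10, 0.12, 0.08]

pooled, total_var = pool_rubin(estimates, variances)
```

The total variance is larger than the within-dataset variance alone. That extra amount is the uncertainty due to the missing data, which single imputation leaves out.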

