Missing Data - A Pervasive Problem in Data Analysis

July 23, 2015

Missing data are a pervasive problem in data analysis. Missing values lead to less efficient estimates because they reduce the effective sample size, and standard complete-data methods of analysis no longer apply directly. For example, analyses such as multiple regression typically use only cases that have complete data, so including a variable with numerous missing values can severely reduce the sample size.

When cases are deleted because one or more variables have missing values, the number of remaining cases can be small even if the missing-data rate for each individual variable is small.

For example, suppose your data set has 5 variables measured at the start of the study and monthly for six months, giving 5 x 7 = 35 values per case. You have been told, with great pride, that each variable is 95% complete. If each of these 35 values is missing independently at random 5% of the time, then the proportion of cases expected to be complete is (.95)^35 = 0.166, and the proportion expected to be incomplete is 1-(.95)^35 = 0.834. That is, only 17% of the cases would be complete, and a complete-case analysis would lose 83% of your cases.
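The arithmetic in this example can be checked with a short sketch. The function name and its parameters are illustrative, and the calculation assumes that every value is missing independently with the same probability, which real data rarely satisfy exactly.

```python
def expected_complete_fraction(n_variables, n_timepoints, completeness):
    """Expected fraction of cases with no missing values.

    Assumes each value is missing independently of all others,
    with the same per-value completeness rate.
    """
    n_values = n_variables * n_timepoints  # values recorded per case
    return completeness ** n_values


# 5 variables, measured at baseline plus six monthly visits (7 time points).
complete = expected_complete_fraction(n_variables=5, n_timepoints=7, completeness=0.95)
incomplete = 1 - complete

print(f"expected complete cases:   {complete:.3f}")
print(f"expected incomplete cases: {incomplete:.3f}")
```

Changing `completeness` or the number of time points shows how quickly the complete-case fraction shrinks as the number of values per case grows.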

Missing data also cause difficulties in performing Intent-to-Treat analyses in randomized experiments. Intent-to-Treat (ITT) analysis dictates that all cases, complete and incomplete, be included in any analysis. Analyzing only the complete cases may introduce bias if there are systematic differences between completers and dropouts.

To select a valid approach for imputing missing values for any particular variable, it is necessary to consider the underlying mechanism that produced the missing data. Variables in a data set may have values that are missing for different reasons.

A laboratory value might be missing because:

  1. It was below the level of detection.
  2. The assay was not done because the patient did not come in for a scheduled visit.
  3. The assay was not done because the test tube was dropped or lost.
  4. The assay was not done because the patient died, was lost to follow-up, or for some other reason.
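Why the reason for missingness matters can be illustrated with a small simulation. This is a minimal sketch, not an analysis of real data: the lab-value distribution, the detection limit, and the drop rate are all made-up assumptions. It contrasts reason 3 (a tube dropped or lost, unrelated to the value) with reason 1 (a value below the level of detection) and compares the mean of the observed values with the true mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical lab values; the distribution and parameters are illustrative only.
true_values = [random.gauss(10.0, 2.0) for _ in range(20_000)]

DETECTION_LIMIT = 8.0  # assumed assay detection limit
DROP_RATE = 0.16       # assumed chance that a tube is dropped or lost

# Reason 3: tube dropped or lost, so missingness is unrelated to the value.
observed_after_drops = [v for v in true_values if random.random() >= DROP_RATE]

# Reason 1: values below the detection limit are the ones that go missing.
observed_above_limit = [v for v in true_values if v >= DETECTION_LIMIT]

true_mean = statistics.mean(true_values)
mean_after_drops = statistics.mean(observed_after_drops)
mean_above_limit = statistics.mean(observed_above_limit)

print(f"true mean:                      {true_mean:.2f}")
print(f"observed mean, dropped tubes:   {mean_after_drops:.2f}")
print(f"observed mean, detection limit: {mean_above_limit:.2f}")
```

Under these assumptions both mechanisms remove a similar share of values, but only the detection-limit mechanism shifts the observed mean noticeably away from the true mean, because the missing values are systematically the low ones.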

 

If missing data are a problem for you, see our white paper discussing the impact of missing data in the ATLAS ACS 2-TIMI 51 Trial.
