Handling Missing Data

Interact with sample code to understand how data visualization can be used for missing data.

Missing data is present in many real-world datasets. It is often handled by removing the affected data points or by imputing them, that is, replacing the missing values with estimated ones. In this lesson, we'll learn how data storytellers handle missing data.
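To make the two options concrete, here is a minimal sketch in plain Python. The readings, the use of None as the missing marker, and the helper names drop_missing and impute_mean are assumptions made for illustration. Mean imputation is only one of many possible estimates.

```python
import statistics

# Hypothetical sensor readings; None marks a missing value.
readings = [21.5, None, 22.0, 23.5, None, 21.0]

def drop_missing(values):
    """Removal: keep only the observed values."""
    return [v for v in values if v is not None]

def impute_mean(values):
    """Imputation: replace each missing value with the mean of the observed ones."""
    observed = drop_missing(values)
    mean = statistics.mean(observed)
    return [mean if v is None else v for v in values]

dropped = drop_missing(readings)   # shorter list, no estimates introduced
imputed = impute_mean(readings)    # same length, gaps filled with the mean

print(dropped)
print(imputed)
```

Removal shrinks the dataset, while imputation keeps its size but introduces estimated values, so the choice depends on how much data is missing and why.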

Why analyze missing data?

In some cases, analyzing missing data helps us understand potential trends or insights that are not captured in our dataset. Missing data can be caused by several factors, such as:

  • Erroneous reporting. For example, consider a digital surveillance camera that is damaged due to weather conditions and is consistently producing blurry footage, or a damaged temperature sensor on a manufacturing floor that is reporting incorrect measurements.

  • Participants who don't wish to provide certain data for survey/

Depending on the programming framework and libraries we are using, missing data may be represented as NaN, N/A, NA, 0 values, and more.
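As a rough illustration of how these representations might be detected, the sketch below flags common missing-value markers in a column of raw values. The marker set, the is_missing helper, and the sample column are assumptions, not a standard. Zero is deliberately not flagged here, because a 0 can also be a legitimate measurement. Whether it means "missing" depends on the dataset.

```python
import math

# Assumed set of text markers for missing values; adjust for the dataset at hand.
MISSING_MARKERS = {"", "nan", "n/a", "na", "null", "none"}

def is_missing(value):
    """Return True if the value looks like a missing-data marker."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    if isinstance(value, str) and value.strip().lower() in MISSING_MARKERS:
        return True
    return False

# Hypothetical raw column mixing several representations.
raw_column = ["12.5", "N/A", "NA", float("nan"), None, "0", 0, "nan"]

missing_flags = [is_missing(v) for v in raw_column]
missing_count = sum(missing_flags)

print(missing_flags)
print(f"{missing_count} of {len(raw_column)} values are missing")
```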

Missing data also falls into several types, including:

  • Structurally missing data: Data that is missing because it does not exist in the first place.

  • Missing completely at ...
