Know Your Data

Learn where data comes from, what forms it takes, and how it’s stored.

Before we start asking big questions, building models, or writing clever queries, we need to take a step back and ask something simpler: What is this data we’re working with? Where did it come from? How is it organized? These might seem like basic questions, but they’re the foundation of everything we do in data science.

Know the backstory

Every dataset has a backstory. It might be clean and well-organized, or it might be messy and inconsistent. Maybe it was collected through a web form, a sensor, or a survey. Each of these origins shapes what the data can tell us, and what it can’t.

Consider planning a trip. You wouldn’t just jump in the car and start driving without knowing where you’re going. In the same way, we need to understand the landscape of our data before we can confidently navigate it.

When we understand how and why the data was collected, why it’s structured the way it is, and what it’s meant to represent, we’re better equipped to use it wisely. We avoid faulty assumptions, ask smarter questions, and make decisions that more accurately reflect reality.

Now let’s take a closer look at the types of data we might work with. Each has its own structure and challenges; recognizing these differences helps us choose the right tools and approaches.

Data types

In data science, not all data looks the same. Some datasets are highly organized and easy to explore. Others are more complex, less structured, and require a bit more work to make sense of. Knowing how to recognize different data types is foundational; it shapes how we clean, analyze, and model the data.
