DataFrame

Explore the fundamentals of Pandas DataFrames including creation, indexing, selecting rows and columns, adding and deleting columns, and conditional filtering. This lesson helps you efficiently manage and analyze structured data using key DataFrame methods.

A DataFrame is a rectangular table of data that contains an ordered collection of columns, each of which can hold a different value type (numeric, string, boolean, and so on). It has both a row index and a column index. A simple way to think about a DataFrame is as a dictionary of Series that all share the same row index.
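The dictionary-of-Series view can be sketched directly. This is only a minimal illustration: the names labels, prices, in_stock, and small_df, along with the values, are made up for this example and are not part of the lesson's data.

```python
import pandas as pd

# Two Series with different value types but the same row labels (hypothetical data)
labels = ['a', 'b', 'c']
prices = pd.Series([1.5, 2.0, 3.25], index=labels)
in_stock = pd.Series([True, False, True], index=labels)

# A dict of Series becomes a DataFrame: the dict keys turn into column names
small_df = pd.DataFrame({'price': prices, 'in_stock': in_stock})
print(small_df)
print(small_df.dtypes)  # each column keeps its own value type
```

Because both Series carry the same labels, pandas lines them up row by row without any extra work.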

Let’s create a few DataFrames to learn more about them.

For our DataFrame, we’ll create two lists of labels:

  • index will hold the row labels, r1 to r10.
  • columns will hold the column labels, c1 to c10.

In the code below, we’ll use split() to turn each string of labels into a list and then use arange() and reshape() together to create a 10×10 2D array (matrix).

Python 3.5
import pandas as pd
import numpy as np
index = 'r1 r2 r3 r4 r5 r6 r7 r8 r9 r10'.split()
columns = 'c1 c2 c3 c4 c5 c6 c7 c8 c9 c10'.split()
# just to see what the index looks like, a list from r1 to r10!
print(index)
# and what columns look like, a list from c1 to c10!
print(columns)
array_2d = np.arange(0,100).reshape(10,10) # creating a 2D array "array_2d"
print(array_2d)

Now, let’s create our first DataFrame using index, columns, and array_2d.

Python 3.5
df = pd.DataFrame(data=array_2d, index=index, columns=columns)
print(df)

Our first DataFrame is df. It has columns c1 to c10 and rows r1 to r10. Each column is a pandas Series, and all of the columns share the same index of row labels.
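As a quick sanity check, the sketch below rebuilds the same df and confirms its shape and that a column really is a Series carrying the row labels. The variable name col is introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame from the lesson so this snippet runs on its own
index = 'r1 r2 r3 r4 r5 r6 r7 r8 r9 r10'.split()
columns = 'c1 c2 c3 c4 c5 c6 c7 c8 c9 c10'.split()
df = pd.DataFrame(np.arange(0, 100).reshape(10, 10), index=index, columns=columns)

print(df.shape)  # (number of rows, number of columns)

col = df['c1']                     # pull out one column
print(type(col))                   # the column is a pandas Series
print(col.index.equals(df.index))  # True when the column shares the DataFrame's row labels
```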

We’ll use df to practice accessing and manipulating data, which is a core skill in this course.

Columns

Grabbing columns from a DataFrame

To grab a column from a DataFrame, we simply pass the name of the required column in square brackets.

Python 3.5
# Grabbing a single column
print("Column C1 : ", df['c1'])

The output is a Series. The returned Series shares ...