Convert and Encode Data Types
Understand how to convert and encode data types appropriately.
Convert data types
There are four main ways we can convert the data type of DataFrame columns:
We can use the astype() function to enforce a pandas dtype.
We can use the to_*() functions, such as to_datetime(), to_timedelta(), and to_numeric(), for manual dtype conversion.
We can use the apply() function to apply a custom Python function to columns.
We can use the convert_dtypes() function for automatic dtype inference and conversion.
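As a rough sketch of how these four approaches differ in practice, the snippet below applies each one to a small toy DataFrame. The frame and its column names (count, price, flag, label) are hypothetical and are not part of the students dataset used in this lesson.

```python
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
toy = pd.DataFrame({
    "count": [1.0, 2.0, 3.0],
    "price": ["1.5", "2.5", "oops"],
    "flag": ["yes", "no", "yes"],
    "label": ["a", "b", None],
})

# 1. astype(): enforce a specific dtype.
toy["count"] = toy["count"].astype("int64")

# 2. to_numeric(): manual conversion; errors="coerce" turns
#    values that cannot be parsed into NaN instead of raising.
toy["price"] = pd.to_numeric(toy["price"], errors="coerce")

# 3. apply(): run a custom Python function on every value.
toy["flag"] = toy["flag"].apply(lambda value: value == "yes")

# 4. convert_dtypes(): let pandas infer the best nullable dtypes.
#    It returns a new DataFrame and leaves toy unchanged.
converted = toy.convert_dtypes()

print(toy.dtypes)
print(converted.dtypes)
```

Note that astype() raises an error when a value cannot be converted, whereas the to_*() functions accept an errors argument, which is one reason to prefer them for messy input.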
Let's illustrate these different ways by using a dataset of students' demographics and academic performance.
# View students dataset
print(df)
print('='*55)
# Inspect data types
print(df.dtypes)
We need to make a series of changes to this dataset:
Convert student_id from float to integer type.
Convert gender from object (aka string) to category type.
Convert enroll_year from object to integer type.
Convert birthdate from object to datetime type.
Convert has_scholarship from integer to ...
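The first four conversions listed above might look roughly like the following. This is a minimal sketch, not the lesson's own solution: the sample values are invented, the date format is assumed to be YYYY-MM-DD, and has_scholarship is left out because its target type is not stated here.

```python
import pandas as pd

# Hypothetical stand-in for the students dataset; all values are invented.
df = pd.DataFrame({
    "student_id": [1001.0, 1002.0, 1003.0],
    "gender": ["F", "M", "F"],
    "enroll_year": ["2019", "2020", "2021"],
    "birthdate": ["2001-04-15", "2002-09-30", "2003-01-08"],
})

# float -> integer
df["student_id"] = df["student_id"].astype("int64")

# object (string) -> category
df["gender"] = df["gender"].astype("category")

# object (string) -> integer
df["enroll_year"] = pd.to_numeric(df["enroll_year"])

# object (string) -> datetime; the format string is an assumption
df["birthdate"] = pd.to_datetime(df["birthdate"], format="%Y-%m-%d")

print(df.dtypes)
```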