Rename Attributes
Learn to rename attributes using pandas and PySpark.
In an organization, data passes through different sources, systems, and processes before it reaches us, so column names often follow inconsistent conventions. We’ll encounter different ways of column naming such ...
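When many columns follow camelCase, a snake_case name for each one can be derived rather than typed by hand. The following is a minimal sketch, not the lesson's own method: `to_snake_case` is a hypothetical helper, and the short `raw_names` list is an illustrative subset of column names.

```python
import re

# Hypothetical helper (not part of the lesson's code): insert "_" before each
# uppercase letter that follows a lowercase letter or digit, then lowercase.
def to_snake_case(name: str) -> str:
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()

# Illustrative subset of column names.
raw_names = ["overall", "reviewTime", "reviewerID", "unixReviewTime"]

# Build an old-name -> new-name dictionary.
derived_map = {name: to_snake_case(name) for name in raw_names}
print(derived_map)
```

An explicit, hand-written dictionary is still easier to review and handles irregular names that a regular expression would get wrong, so a derived mapping is best treated as a starting point.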
# Map each original column name to its snake_case equivalent.
COL_NAME_MAP = {
    "overall": "overall",
    "verified": "verified",
    "reviewTime": "review_time",
    "reviewerID": "reviewer_id",
    "asin": "asin",
    "reviewerName": "reviewer_name",
    "reviewText": "review_text",
    "summary": "summary",
    "unixReviewTime": "unix_review_time",
    "style": "style",
    "vote": "vote",
    "image": "image",
}

# Assumes raw_pdf is an existing pandas DataFrame.
print('Initial column names:')
for i, col_name in enumerate(raw_pdf.columns, start=1):
    print(f'{i}: {col_name}')

# Rename the columns; rename() returns a new DataFrame.
raw_pdf = raw_pdf.rename(columns=COL_NAME_MAP)
print('___________________________')
print('Column names after rename:')
for i, col_name in enumerate(raw_pdf.columns, start=1):
    print(f'{i}: {col_name}')
print('___________________________')
print('Code Executed Successfully')
Renaming columns in pandas
After successful code execution, we’ll see the message ...
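One detail worth knowing about `DataFrame.rename` is that, by default, mapping keys that match no column are silently ignored. The sketch below illustrates this on a toy DataFrame. The name `sample_pdf`, its values, and the shortened `name_map` are assumptions made up for illustration and are not the lesson's data. Passing `errors="raise"` makes a missing source column fail with a `KeyError` instead.

```python
import pandas as pd

# Toy stand-in for the lesson's DataFrame; the values are made up.
sample_pdf = pd.DataFrame({
    "reviewerID": ["A1", "B2"],
    "reviewText": ["Great", "Okay"],
    "overall": [5.0, 3.0],
})

name_map = {
    "reviewerID": "reviewer_id",
    "reviewText": "review_text",
    "vote": "vote",  # no such column in sample_pdf
}

# Default behavior: keys that match no column are ignored.
renamed_pdf = sample_pdf.rename(columns=name_map)
print(list(renamed_pdf.columns))

# errors="raise" turns a missing source column into a KeyError.
try:
    sample_pdf.rename(columns=name_map, errors="raise")
    missing_detected = False
except KeyError:
    missing_detected = True
print(missing_detected)
```

Using `errors="raise"` can help catch a typo in the mapping early, at the cost of failing when a column is legitimately absent from a given data source.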