
Challenge 1: Sum of Plays per Country (Trivial)

Explore how to use Pandas grouping to aggregate data by country. Learn to apply groupby and sum functions to calculate total plays per country and return results as a dictionary.

Problem definition

Your music analyst is interested in finding the sum of plays of artists by country. Can you return the total number of plays for each country?

Expected output

A Python dictionary in which each key is a country name and each value is the sum of plays of artists originating from that country.

Example: {'US': <x>, 'Finland': <y>, 'UK': <z>, 'Egypt': <a>}

Solution

Python
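import pandas as pd

def test():
    # Load the dataset; it must contain 'country' and 'plays' columns.
    df = pd.read_csv('music.csv')
    # Group rows by country, sum the plays in each group, and convert to a dict.
    return df.groupby('country').plays.sum().to_dict()

print(test())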

Solution explanation

Every problem in this lesson becomes much easier to solve with the df.groupby() function.

Since you want to apply an aggregate function and get one output per group, where each group corresponds to a country, the first step is to group by the country column: df.groupby('country').
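To see what the grouping step produces on its own, here is a minimal sketch. It assumes a small hand-made DataFrame in place of music.csv; the artist names and play counts are hypothetical, and only the country and plays column names come from the solution above.

```python
import pandas as pd

# Hypothetical sample data standing in for music.csv.
df = pd.DataFrame({
    "artist": ["A", "B", "C", "D"],
    "country": ["US", "UK", "US", "Finland"],
    "plays": [100, 50, 25, 10],
})

# Grouping alone computes nothing yet; it only partitions rows by country.
grouped = df.groupby("country")

# Count how many rows fall into each group.
group_sizes = grouped.size().to_dict()
print(group_sizes)
```

Each distinct country value becomes one group, so the two US rows land in the same group while UK and Finland each get a group of their own.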

The next step is to select the plays column and apply the sum function, which produces the sum of plays per group (i.e., per country). So far, the expression is df.groupby('country').plays.sum().
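The aggregation step can be sketched as follows, again on hypothetical sample data rather than the real music.csv. It uses bracket notation, ["plays"], which selects the same column as the .plays attribute access and also works for column names that are not valid Python identifiers.

```python
import pandas as pd

# Hypothetical sample data standing in for music.csv.
df = pd.DataFrame({
    "artist": ["A", "B", "C", "D"],
    "country": ["US", "UK", "US", "Finland"],
    "plays": [100, 50, 25, 10],
})

# Select the plays column within each country group, then sum it.
plays_per_country = df.groupby("country")["plays"].sum()
print(plays_per_country)
```

The result has one entry per country. With this sample, the two US rows (100 and 25 plays) collapse into a single total of 125.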

The last step, .to_dict(), converts the result to a dictionary. Because you grouped by country and summed the plays column, the result is a Series whose index holds the country names and whose values hold the sum of plays. Converting it yields the expected format: {'US': <x>, 'Finland': <y>, 'UK': <z>, 'Egypt': <a>}
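As a quick check of the final conversion, the sketch below inspects the type of the grouped sum before and after calling .to_dict(). The sample data is hypothetical; only the country and plays column names are taken from the solution.

```python
import pandas as pd

# Hypothetical sample data standing in for music.csv.
df = pd.DataFrame({
    "country": ["US", "UK", "US", "Finland"],
    "plays": [100, 50, 25, 10],
})

# Summing a single grouped column gives a pandas Series indexed by country.
totals = df.groupby("country")["plays"].sum()
print(type(totals).__name__)

# to_dict() maps each index label (country) to its value (total plays).
result = totals.to_dict()
print(result)
```

The index labels become the dictionary keys and the summed values become the dictionary values, which is why no extra reshaping is needed to match the expected output.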