Comparing Ratings by Gender and Plotting Them
Learn how to filter the movie lens database by gender.
We'll cover the following...
Let’s see which movies are top-rated by females in the movie lens database.
Look at the highlighted line 26 of the code widget below. We’re sorting ratings_by_gender in the 'F’ column, which stands for female:
Python 3.8
import pandas as pdimport matplotlib.pyplot as pltdef movie(nogui=False, movielenspath=''):user_columns = ['user_id', 'age', 'gender']users = pd.read_csv(movielenspath + 'movie_lens/u.user', sep='|', names=user_columns, usecols=range(3))rating_columns = ['user_id', 'movie_id', 'rating']ratings = pd.read_csv(movielenspath + 'movie_lens/u.data', sep='\t', names=rating_columns, usecols=range(3))movie_columns = ['movie_id', 'title']movies = pd.read_csv(movielenspath + 'movie_lens/u.item', sep='|', names=movie_columns, usecols=range(2), encoding="iso-8859-1")# create one merged DataFramemovie_ratings = pd.merge(movies, ratings)movie_data = pd.merge(movie_ratings, users)ratings_by_title = movie_data.groupby('title').size()popular_movies = ratings_by_title.index[ratings_by_title >= 250]ratings_by_gender = movie_data.pivot_table('rating', index='title',columns='gender')ratings_by_gender = ratings_by_gender.loc[popular_movies]top_movies_individuals_tagged_as_female = ratings_by_gender.sort_values(by='F', ascending=False)print("Top rated movies by individuals_tagged_as_female \n", top_movies_individuals_tagged_as_female.head())print("\n")if __name__ == "__main__":movie()
As we can see, most ...
Ask