Author: Waldy Setiono
Recommendation systems are widely used in our daily lives and, to some extent, play a significant role in shaping the decisions we make. Almost everything we buy, watch, consume, use, or do is influenced by some form of recommendation, whether from friends, family, a Google search, a shaman, a preacher, a political leader, an advisor, a lawyer, a doctor, a scholar, online reviews, or an app's algorithm. Big companies gain substantial revenue growth by building recommender engines into their platforms.
Recommendation systems can be built using:
- Content-based Filtering,
- Collaborative Filtering, or
- Combination of both (hybrid)
Content-based filtering tries to guess what a user may like from that user's own activity, whereas collaborative filtering predicts what a user might like from the preferences of other users who are similar to that user. Collaborative filtering can be memory-based or model-based.
This project aims to develop an end-to-end recommendation system that suggests movies a user might like, using model-based collaborative filtering.
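To make the memory-based flavour of collaborative filtering concrete before moving on to the model-based approach, here is a minimal sketch. It assumes a tiny invented rating matrix and cosine similarity between users; the names `ratings_toy` and `cosine_similarity` are illustrative and are not part of this project's code.

```python
import numpy as np

# Toy ratings (rows = users, columns = movies); 0 means "not rated".
# These numbers are invented for illustration only.
ratings_toy = np.array([
    [5.0, 4.0, 0.0, 1.0],   # user 0 (target): has not rated movie 2
    [4.0, 5.0, 4.0, 1.0],   # user 1: tastes close to user 0
    [1.0, 1.0, 2.0, 5.0],   # user 2: tastes far from user 0
])


def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


target, movie = 0, 2
others = [u for u in range(len(ratings_toy)) if u != target]

# Similarity of the target user to every other user.
sims = np.array(
    [cosine_similarity(ratings_toy[target], ratings_toy[u]) for u in others]
)

# Predicted rating: similarity-weighted average of the other users' ratings.
predicted = float(sims @ ratings_toy[others, movie] / sims.sum())
print(sims, predicted)
```

The prediction leans toward the rating given by the more similar user. A model-based method, by contrast, first learns a compact representation of the rating matrix and then makes predictions from that representation.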
# Import packages.
import pandas as pd
import numpy as np
from io import BytesIO
from urllib.request import urlopen
from zipfile import ZipFile
import os
import platform
import pprint
from typing import Dict, Text
%matplotlib inline
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.decomposition import TruncatedSVD
Data: The data used in this project come from GroupLens, a research lab at the University of Minnesota. The dataset contains over 100,000 ratings applied to roughly 9,000 movies by roughly 600 users.
# Load dataset.
zipurl = "https://files.grouplens.org/datasets/movielens/ml-latest-small.zip"
with urlopen(zipurl) as zipresp:
    with ZipFile(BytesIO(zipresp.read())) as zfile:
        zfile.extractall("/tmp/movielens")
Titles and Genres
# Check movie titles.
movies = pd.read_csv('/tmp/movielens/ml-latest-small/movies.csv')
movies
| | movieId | title | genres |
---|---|---|---|
0 | 1 | Toy Story (1995) | Adventure|Animation|Children|Comedy|Fantasy |
1 | 2 | Jumanji (1995) | Adventure|Children|Fantasy |
2 | 3 | Grumpier Old Men (1995) | Comedy|Romance |
3 | 4 | Waiting to Exhale (1995) | Comedy|Drama|Romance |
4 | 5 | Father of the Bride Part II (1995) | Comedy |
... | ... | ... | ... |
9737 | 193581 | Black Butler: Book of the Atlantic (2017) | Action|Animation|Comedy|Fantasy |
9738 | 193583 | No Game No Life: Zero (2017) | Animation|Comedy|Fantasy |
9739 | 193585 | Flint (2017) | Drama |
9740 | 193587 | Bungo Stray Dogs: Dead Apple (2018) | Action|Animation |
9741 | 193609 | Andrew Dice Clay: Dice Rules (1991) | Comedy |
9742 rows × 3 columns
# Print how many unique values of each column.
print("There are ", movies.movieId.nunique(), "unique values in movieID.")
print("There are ", movies.title.nunique(), "unique values in title.")
print("There are ", movies.genres.nunique(), "unique values in genres.")
There are 9742 unique values in movieID.
There are 9737 unique values in title.
There are 951 unique values in genres.
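The counts above show 9742 movie IDs but only 9737 distinct titles, so a few titles are shared by more than one `movieId`. This matters later because the utility matrix is indexed by title. The sketch below shows one way to list such rows; it uses an invented stand-in frame (`movies_toy`) rather than the real `movies.csv`.

```python
import pandas as pd

# Toy stand-in for movies.csv; the rows are invented for illustration.
movies_toy = pd.DataFrame({
    "movieId": [1, 2, 3, 4],
    "title": ["Film A (2001)", "Film B (2002)", "Film A (2001)", "Film C (2003)"],
})

# keep=False flags every row whose title occurs more than once,
# not only the second and later occurrences.
duplicated_titles = movies_toy[movies_toy["title"].duplicated(keep=False)]
print(duplicated_titles.sort_values("title"))
```

On the real data, the same expression applied to `movies` should list the titles that map to several IDs.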
Ratings
# Make a dataframe of ratings.
ratings = pd.read_csv('/tmp/movielens/ml-latest-small/ratings.csv')
ratings
| | userId | movieId | rating | timestamp |
---|---|---|---|---|
0 | 1 | 1 | 4.0 | 964982703 |
1 | 1 | 3 | 4.0 | 964981247 |
2 | 1 | 6 | 4.0 | 964982224 |
3 | 1 | 47 | 5.0 | 964983815 |
4 | 1 | 50 | 5.0 | 964982931 |
... | ... | ... | ... | ... |
100831 | 610 | 166534 | 4.0 | 1493848402 |
100832 | 610 | 168248 | 5.0 | 1493850091 |
100833 | 610 | 168250 | 5.0 | 1494273047 |
100834 | 610 | 168252 | 5.0 | 1493846352 |
100835 | 610 | 170875 | 3.0 | 1493846415 |
100836 rows × 4 columns
# Drop timestamp from the dataframe.
ratings = ratings.drop(columns=["timestamp"])
ratings
| | userId | movieId | rating |
---|---|---|---|
0 | 1 | 1 | 4.0 |
1 | 1 | 3 | 4.0 |
2 | 1 | 6 | 4.0 |
3 | 1 | 47 | 5.0 |
4 | 1 | 50 | 5.0 |
... | ... | ... | ... |
100831 | 610 | 166534 | 4.0 |
100832 | 610 | 168248 | 5.0 |
100833 | 610 | 168250 | 5.0 |
100834 | 610 | 168252 | 5.0 |
100835 | 610 | 170875 | 3.0 |
100836 rows × 3 columns
# Print how many unique values of each column.
print("There are ", ratings.userId.nunique(), "unique values in userID.")
print("There are ", ratings.movieId.nunique(), "unique values in movieID.")
print("There are ", ratings.rating.nunique(), "unique values in rating.")
There are 610 unique values in userID.
There are 9724 unique values in movieID.
There are 10 unique values in rating.
# Check missing values.
null_data = ratings[ratings.isnull().any(axis=1)]
null_data
| | userId | movieId | rating |
---|---|---|---|
The result is empty, so the dataframe has no missing values.
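A more compact way to run the same check is to count missing values per column. The sketch below uses an invented frame (`ratings_toy`) with one deliberately missing rating, so that the counts are not all zero.

```python
import numpy as np
import pandas as pd

# Toy ratings with one missing value, invented for illustration.
ratings_toy = pd.DataFrame({
    "userId": [1, 1, 2],
    "movieId": [10, 20, 10],
    "rating": [4.0, np.nan, 3.5],
})

# Number of missing entries in each column.
missing_per_column = ratings_toy.isnull().sum()
print(missing_per_column)
```

Applied to the real `ratings` dataframe, every count would be 0, in line with the empty result above.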
Movies and Ratings
# Merge movies and ratings.
movies_ratings = pd.merge(ratings, movies, on='movieId')
movies_ratings = movies_ratings.drop(columns=["movieId", "genres"])
movies_ratings
| | userId | rating | title |
---|---|---|---|
0 | 1 | 4.0 | Toy Story (1995) |
1 | 5 | 4.0 | Toy Story (1995) |
2 | 7 | 4.5 | Toy Story (1995) |
3 | 15 | 2.5 | Toy Story (1995) |
4 | 17 | 4.5 | Toy Story (1995) |
... | ... | ... | ... |
100831 | 610 | 2.5 | Bloodmoon (1997) |
100832 | 610 | 4.5 | Sympathy for the Underdog (1971) |
100833 | 610 | 3.0 | Hazard (2005) |
100834 | 610 | 3.5 | Blair Witch (2016) |
100835 | 610 | 3.5 | 31 (2016) |
100836 rows × 3 columns
Popularity-based Recommender
One of the simplest movie recommenders is a popularity-based one. For example, it can suggest the 10 movies with the most ratings.
# Recommend movies based on rating counts.
rating_count = pd.DataFrame(movies_ratings.groupby("title")["rating"].count())
rating_count.sort_values("rating", ascending=False).head(10)
title | rating |
---|---|
Forrest Gump (1994) | 329 |
Shawshank Redemption, The (1994) | 317 |
Pulp Fiction (1994) | 307 |
Silence of the Lambs, The (1991) | 279 |
Matrix, The (1999) | 278 |
Star Wars: Episode IV - A New Hope (1977) | 251 |
Jurassic Park (1993) | 238 |
Braveheart (1995) | 237 |
Terminator 2: Judgment Day (1991) | 224 |
Schindler's List (1993) | 220 |
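Counting ratings measures how often a movie is rated, not how well it is rated. One possible variation, sketched below, ranks by mean rating but only among movies with a minimum number of ratings. The threshold `min_ratings` and the toy frame are assumptions made for illustration, not values used in this project.

```python
import pandas as pd

# Toy stand-in for movies_ratings; the rows are invented for illustration.
movies_ratings_toy = pd.DataFrame({
    "title": ["A", "A", "A", "B", "B", "C"],
    "rating": [4.0, 5.0, 3.0, 2.0, 3.0, 5.0],
})

# Rating count and mean rating per title.
stats = movies_ratings_toy.groupby("title")["rating"].agg(
    rating_count="count", rating_mean="mean"
)

min_ratings = 2  # hypothetical threshold; tune it for the real data
popular = stats[stats["rating_count"] >= min_ratings].sort_values(
    ["rating_mean", "rating_count"], ascending=False
)
print(popular)
```

Movie "C" has the highest mean but only one rating, so the threshold removes it. This keeps a handful of enthusiastic ratings from outranking widely watched titles.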
Utility Matrix
To build a recommender based on collaborative filtering, let's create a utility matrix with a pivot table: one row per movie title, one column per user ID, and each cell holding the rating that user gave that movie. Cells with no rating are filled with 0.
# Create utility matrix using pivot table.
X = movies_ratings.pivot_table(values='rating', index='title', columns='userId').fillna(0)
X
userId | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | ... | 571 | 572 | 573 | 574 | 575 | 576 | 577 | 578 | 579 | 580 | 581 | 582 | 583 | 584 | 585 | 586 | 587 | 588 | 589 | 590 | 591 | 592 | 593 | 594 | 595 | 596 | 597 | 598 | 599 | 600 | 601 | 602 | 603 | 604 | 605 | 606 | 607 | 608 | 609 | 610 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
title | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
'71 (2014) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 |
'Hellboy': The Seeds of Creation (2004) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
'Round Midnight (1986) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
'Salem's Lot (2004) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
'Til There Was You (1997) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
eXistenZ (1999) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.5 | 0.0 | 0.0 | 0.0 | 5.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.5 | 0.0 | 0.0 |
xXx (2002) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.5 | 0.0 | 2.0 |
xXx: State of the Union (2005) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.5 |
¡Three Amigos! (1986) | 4.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 5.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.0 | 0.0 | 2.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
À nous la liberté (Freedom for Us) (1931) | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
9719 rows × 610 columns
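The utility matrix is very sparse: at most 100,836 of its 9,719 × 610 cells can hold a real rating, which is about 1.7%. The sketch below shows how that density could be measured before the missing cells are filled. It uses an invented frame; on the real data the same steps would apply to `movies_ratings`. Note also that `pivot_table` averages by default when one (title, user) pair has several ratings, which would be consistent with the matrix having 9,719 rows rather than 9,724.

```python
import pandas as pd

# Toy stand-in for movies_ratings; the rows are invented for illustration.
toy = pd.DataFrame({
    "userId": [1, 1, 2, 3],
    "title": ["A", "B", "A", "C"],
    "rating": [4.0, 3.0, 5.0, 2.0],
})

# Pivot without filling, so unrated cells stay NaN.
X_toy = toy.pivot_table(values="rating", index="title", columns="userId")

# Fraction of cells that hold an actual rating.
density = X_toy.notna().to_numpy().mean()

# Filling with 0 makes the matrix dense, but a 0 then looks like a very low rating.
X_filled = X_toy.fillna(0)
print(density)
print(X_filled)
```

Filling with 0 is a simple choice that lets the decomposition run. It does treat "not rated" as if it were a rating of 0, which is worth keeping in mind when interpreting the similarities that follow.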
# Decompose utility matrix using Truncated SVD
svd = TruncatedSVD(n_components=12, random_state=17)
decomposed_matrix = svd.fit_transform(X)
# Check the resultant matrix shape
decomposed_matrix.shape
(9719, 12)
print(svd.explained_variance_ratio_)
[0.17452408 0.04189715 0.02633773 0.02137632 0.0185918 0.01612086
0.0143732 0.01178569 0.01147853 0.0099213 0.00934755 0.00905726]
print(svd.explained_variance_ratio_.sum())
0.3648114678094881
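The 12 components retain about 36% of the variance. If a different trade-off is wanted, one way to choose `n_components` is to look at the cumulative explained variance. The sketch below does this on a random stand-in matrix; `X_toy` and the `target` fraction are assumptions for illustration, and the resulting numbers say nothing about the MovieLens data.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Random stand-in for the utility matrix (50 "movies" x 20 "users").
rng = np.random.default_rng(17)
X_toy = rng.random((50, 20))

svd_toy = TruncatedSVD(n_components=10, random_state=17)
svd_toy.fit(X_toy)

# Running total of the variance explained by the first 1, 2, ... components.
cumulative = np.cumsum(svd_toy.explained_variance_ratio_)

target = 0.5  # hypothetical target fraction of variance to retain
reached = np.nonzero(cumulative >= target)[0]
# Smallest number of components reaching the target, or None if never reached.
k = int(reached[0]) + 1 if reached.size else None
print(cumulative.round(3), k)
```

If `k` is `None`, the target is not reached with the fitted number of components and a larger `n_components` would need to be tried.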
# Generate correlation matrix
corr_matrix = np.corrcoef(decomposed_matrix)
print(corr_matrix.shape)
corr_matrix
(9719, 9719)
array([[ 1. , 0.20967451, 0.30277437, ..., 0.79074266,
-0.09266651, -0.11632059],
[ 0.20967451, 1. , 0.93621217, ..., 0.11127732,
0.03997583, -0.24647969],
[ 0.30277437, 0.93621217, 1. , ..., 0.10717506,
0.19895528, 0.01216579],
...,
[ 0.79074266, 0.11127732, 0.10717506, ..., 1. ,
-0.11547412, -0.11670845],
[-0.09266651, 0.03997583, 0.19895528, ..., -0.11547412,
1. , 0.32751487],
[-0.11632059, -0.24647969, 0.01216579, ..., -0.11670845,
0.32751487, 1. ]])
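`np.corrcoef` treats each row of its input as one variable, so passing the (9719, 12) matrix yields the Pearson correlation between every pair of movies, computed across their 12 latent features. The toy example below, with invented numbers, shows that behaviour on three rows.

```python
import numpy as np

# Three "movies" described by three latent features (invented values).
latent = np.array([
    [1.0, 2.0, 3.0],   # movie 0
    [2.0, 4.0, 6.5],   # movie 1: rises with movie 0
    [3.0, 2.0, 1.0],   # movie 2: exact mirror image of movie 0
])

# Rows are variables, so the result is movie-by-movie (3 x 3).
corr = np.corrcoef(latent)
print(corr.round(3))
```

Movies 0 and 1 correlate close to +1 and movies 0 and 2 at -1. On the full data the result is a 9719 × 9719 matrix of 64-bit floats, which takes roughly 0.75 GB of memory.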
# Create list of movies names
movies_names = X.index
movies_list = list(movies_names)
Suppose we want to recommend movies similar to Spider-Man.
# Find the title of the movie on which to base the recommendation
basis_movie = movies_names.str.contains('Spider', regex=False)
for x in range(len(basis_movie)):
    if basis_movie[x]:
        print(movies_names[x])
Along Came a Spider (2001)
Amazing Spider-Man, The (2012)
Giant Spider Invasion, The (1975)
Horrors of Spider Island (Ein Toter Hing im Netz) (1960)
Kiss of the Spider Woman (1985)
Spider (2002)
Spider-Man (2002)
Spider-Man 2 (2004)
Spider-Man 3 (2007)
Spiderwick Chronicles, The (2008)
The Amazing Spider-Man 2 (2014)
Untitled Spider-Man Reboot (2017)
# Find the row index of the basis movie in the correlation matrix
basis_index = movies_list.index('Spider-Man (2002)')
print(basis_index)
7921
Pearson Correlation Coefficient
# Select the basis movie's correlations with every other movie
corr_similar_movies = corr_matrix[basis_index]
corr_similar_movies
array([0.216504 , 0.56198288, 0.53920046, ..., 0.44844531, 0.50734146,
0.0592397 ])
Recommend highly correlated movies
list(movies_names[(corr_similar_movies < 1) & (corr_similar_movies > 0.9)])
['A.I. Artificial Intelligence (2001)',
'Armageddon (1998)',
'Back to the Future Part II (1989)',
'Back to the Future Part III (1990)',
'Batman Begins (2005)',
'Big Fish (2003)',
'Bourne Identity, The (2002)',
'Bourne Supremacy, The (2004)',
'Cast Away (2000)',
'Catch Me If You Can (2002)',
"Charlie's Angels (2000)",
'Chicken Run (2000)',
'Chronicles of Narnia: The Lion, the Witch and the Wardrobe, The (2005)',
'Crouching Tiger, Hidden Dragon (Wo hu cang long) (2000)',
'Fifth Element, The (1997)',
'Gladiator (2000)',
'Hero (Ying xiong) (2002)',
'House of Flying Daggers (Shi mian mai fu) (2004)',
'Illusionist, The (2006)',
'Incredibles, The (2004)',
'Italian Job, The (2003)',
'K-PAX (2001)',
'Last Samurai, The (2003)',
'Lord of the Rings: The Fellowship of the Ring, The (2001)',
'Lord of the Rings: The Two Towers, The (2002)',
'Mask of Zorro, The (1998)',
'Matrix Reloaded, The (2003)',
'Matrix Revolutions, The (2003)',
'Minority Report (2002)',
'Monsters, Inc. (2001)',
"Ocean's Eleven (2001)",
'Pirates of the Caribbean: The Curse of the Black Pearl (2003)',
'Road to Perdition (2002)',
'School of Rock (2003)',
'Serenity (2005)',
'Shrek (2001)',
'Shrek 2 (2004)',
'Signs (2002)',
'Spider-Man (2002)',
'Spider-Man 2 (2004)',
'Star Wars: Episode I - The Phantom Menace (1999)',
'Star Wars: Episode II - Attack of the Clones (2002)',
'Star Wars: Episode III - Revenge of the Sith (2005)',
'Truman Show, The (1998)',
'Unbreakable (2000)',
'WarGames (1983)',
'X-Men (2000)',
'X-Men: The Last Stand (2006)',
'X2: X-Men United (2003)']
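The threshold filter returns an unranked list of variable length, and the output above still contains 'Spider-Man (2002)' itself, presumably because floating-point rounding leaves its self-correlation slightly below 1. One possible refinement, sketched below, excludes the basis movie explicitly and returns the top N titles ranked by correlation. The helper `top_n_similar` and the toy matrix are hypothetical and are not part of the original notebook.

```python
import numpy as np


def top_n_similar(corr_matrix, titles, basis_title, n=5):
    """Return the n titles most correlated with basis_title, best first."""
    titles = list(titles)
    basis_index = titles.index(basis_title)
    scores = np.asarray(corr_matrix[basis_index], dtype=float).copy()
    scores[basis_index] = -np.inf          # never recommend the basis movie itself
    order = np.argsort(scores)[::-1][:n]   # indices of the n highest scores
    return [(titles[i], float(scores[i])) for i in order]


# Toy correlation matrix for four titles (invented values).
titles_toy = ["A", "B", "C", "D"]
corr_toy = np.array([
    [1.0, 0.9, 0.2, -0.3],
    [0.9, 1.0, 0.4, 0.1],
    [0.2, 0.4, 1.0, 0.8],
    [-0.3, 0.1, 0.8, 1.0],
])

recommendations = top_n_similar(corr_toy, titles_toy, "A", n=2)
print(recommendations)
```

With the notebook's variables, the call would look like `top_n_similar(corr_matrix, movies_names, 'Spider-Man (2002)', n=10)`.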
- Rajaraman, A., Ullman, J. D. (2014). Mining of massive datasets. Cambridge: Cambridge University Press.
- Banik, R. (2018). Hands-on recommendation systems with python. Birmingham: Packt.
- Department of Computer Science and Engineering, University of Minnesota. (2021). GroupLens. https://grouplens.org/datasets/movielens
- Strang, G. (2016). Introduction to linear algebra. MA: Wellesley-Cambridge Press.
- Felfernig, A., et al. (2018). Group recommender systems. Springer.