Song Recommendation System

Music Recommendation System Using Python

Team Members: Pragya Raghuvanshi, Sukhpreet Sahota, Dingkun Yang

Introduction

This GitHub repo contains a song recommendation system built on Spotify data that was originally pulled via the Spotify API and published on Kaggle. The system suggests songs to users based on song popularity.

Dataset

The dataset used for this project is the Spotify dataset extracted by VATSAL MAVANI and posted on Kaggle (link: Dataset).

Some features of the data:

Song release years: 1921 to 2020

Song count: 133,638

Genre count: 2,973
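
As a rough illustration of how summary figures like these could be checked, the sketch below counts songs and finds the range of release years. It is only a sketch: the column names are assumed from the fake rows used in the tests later in this README, the three sample rows are made up, and summarize is a hypothetical helper rather than part of the repository.

```python
import csv
import io

# Tiny in-memory stand-in for the Kaggle CSV. The column names are an
# assumption taken from the fake rows in this README's tests, and the
# rows themselves are invented for illustration.
SAMPLE_CSV = """id,name,artists,year,popularity
a1,Song A,Artist X,1921,4
a2,Song B,Artist Y,1975,40
a3,Song C,Artist X,2020,70
"""


def summarize(csv_file):
    """Return (song_count, first_year, last_year) for an open CSV file."""
    years = [int(row["year"]) for row in csv.DictReader(csv_file)]
    return len(years), min(years), max(years)


summary = summarize(io.StringIO(SAMPLE_CSV))
print(summary)  # (3, 1921, 2020)
```

On the real dataset, the same function could be given an open file handle instead of the in-memory sample.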

Approach

To create this recommendation system, we divided the project into the following key steps:

Data Cleaning
Exploratory Data Analysis (EDA)
Building out a song popularity recommender
Incorporating the song features recommender into the song popularity recommender
Testing all components

Below is an illustrative flow chart outlining these steps:

Installation

To replicate the results, please fork this git repository or clone it using git clone git@github.com:Yer1k/Song_Recommender.git

Upon doing so, please ensure that all test files and source files are within the same directory; if you are running the code locally, move all test files into your tests/ directory. Install all libraries listed in the requirements.txt file. In addition to those libraries, run pip install pytest before importing or executing the test files.

Test Case Examples

test_song_class and test_parse_data Example

Start by creating a Song instance and asserting that its attributes and string representation are returned correctly. To test the parse_data function, build a fake song file with the same fields, pass it through parse_data, and assert that each entry in the returned dictionary matches the song that would be created from the data.txt file.

Example usage:

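# Song is used below but was not imported in the original snippet;
# this assumes it is defined alongside parse_data.
from parse_data import (
    Song,
    parse_data,
)
from fake_files import fake_files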


def test_song_class() -> None:
    """Test Song class."""
    song = Song(
        "4BJqT0PrAfrxzMOxytFOIz",
        "Piano Concerto No. 3 in D Minor",
        "Sergei Rachmaninoff & James Levine & Berliner Philharmoniker",
        "1921",
        "4",
    )
    assert song.song_id == "4BJqT0PrAfrxzMOxytFOIz"
    assert song.song_name == "Piano Concerto No. 3 in D Minor"
    assert (
        song.artist_name
        == "Sergei Rachmaninoff & James Levine & Berliner Philharmoniker"
    )
    assert song.year == "1921"
    assert song.popularity == "4"
    assert repr(song) == (
        "Song Name: Piano Concerto No. 3 in D Minor "
        + "by Sergei Rachmaninoff & James Levine & Berliner Philharmoniker, "
        + "Year: 1921"
    )


def test_parse_data() -> None:
    """Test parse_data function."""
    with fake_files(
        [
            [
                "id",
                "name",
                "artists",
                "year",
                "popularity",
            ],
            [
                "4BJqT0PrAfrxzMOxytFOIz",
                "Piano Concerto No. 3 in D Minor",
                "Sergei Rachmaninoff & James Levine & Berliner Philharmoniker",
                "1921",
                "4",
            ],
        ]
    ) as (song_file,):
        song_dict = parse_data(song_file)
        assert (
            song_dict["4BJqT0PrAfrxzMOxytFOIz"].song_id
            == "4BJqT0PrAfrxzMOxytFOIz"
        )
        assert song_dict["4BJqT0PrAfrxzMOxytFOIz"].song_name == (
            "Piano Concerto No. 3 in D Minor"
        )
        assert (
            song_dict["4BJqT0PrAfrxzMOxytFOIz"].artist_name
            == "Sergei Rachmaninoff & James Levine & Berliner Philharmoniker"
        )
        assert song_dict["4BJqT0PrAfrxzMOxytFOIz"].year == "1921"
        assert song_dict["4BJqT0PrAfrxzMOxytFOIz"].popularity == "4"
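
To make the behavior exercised by these tests more concrete, here is a minimal sketch of a Song class and a parse_data function that would satisfy assertions of this shape. It is not the repository's implementation: it assumes the song file is comma-separated with a header row and that parse_data receives an open file object, neither of which is confirmed by this README. The attribute names and the repr format are taken from the tests above.

```python
import csv
import io
from dataclasses import dataclass
from typing import Dict


@dataclass(repr=False)
class Song:
    """One song row; every field is kept as a string, as in the tests."""

    song_id: str
    song_name: str
    artist_name: str
    year: str
    popularity: str

    def __repr__(self) -> str:
        return (
            f"Song Name: {self.song_name} by {self.artist_name}, "
            f"Year: {self.year}"
        )


def parse_data(song_file) -> Dict[str, Song]:
    """Map song id -> Song for an open CSV file with a header row.

    Assumption: the file is comma-separated with the columns
    id, name, artists, year, popularity.
    """
    return {
        row["id"]: Song(
            row["id"],
            row["name"],
            row["artists"],
            row["year"],
            row["popularity"],
        )
        for row in csv.DictReader(song_file)
    }


# In-memory stand-in for a song file, reusing the row from the tests.
demo_file = io.StringIO(
    "id,name,artists,year,popularity\n"
    "4BJqT0PrAfrxzMOxytFOIz,Piano Concerto No. 3 in D Minor,"
    "Sergei Rachmaninoff & James Levine & Berliner Philharmoniker,1921,4\n"
)
songs = parse_data(demo_file)
print(songs["4BJqT0PrAfrxzMOxytFOIz"])
```

Keying the dictionary by song id keeps lookups direct, which matches how the test indexes song_dict.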

test_popularity_recommender Example

To test the popularity recommender, we build a SongRecommendationSystem from a fake song file created by fake_files and inspect its output. The assert statements check that the average popularity per artist is returned correctly and that the number of recommended songs matches the number the user asked for. We also check that the user gets summary statistics on the number of songs and artists within a given file.

Example usage:

from fake_files import fake_files
from popularity_recommender import SongRecommendationSystem


def test_calculate_artist_avg_popularity() -> None:
    """Test the SongRecommendationSystem popularity methods."""
    # set up
    with fake_files(
        [
            [
                "id",
                "name",
                "artists",
                "year",
                "popularity",
            ],
            [
                "4BJqT0PrAfrxzMOxytFOIz",
                "Pragya is a genius",
                "Dingkun Yang",
                "1921",
                "4",
            ],
            # ... (remaining fake song rows are elided in this README)
        ]
    ) as (song_file,):
        # run
        s = SongRecommendationSystem(song_file)

        # assert: average popularity per individual artist
        assert s.calculate_artist_avg_popularity() == {
            "Sergei Rachmaninoff": 5,
            "James Levine": 5,
            "Berliner Philharmoniker": 5,
            "Dingkun Yang": 3,
            "Pragya R": 3,
            "Sukhpreet S": 6,
        }
        # assert: exactly the requested number of songs is returned
        assert s.recommend_songs(2) == [
            "Dingkun is a genius",
            "Piano Concerto No. 3 in C Major",
        ]
        # assert: summary of songs and artists in the file
        assert (
            str(s)
            == "SongRecommendationSystem with 5 songs and 6 artists."
        )
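
The three behaviors checked above (per-artist average popularity, top-n recommendations, and a summary string) can be sketched as follows. This is a hypothetical stand-in, not the repository's SongRecommendationSystem: the class name PopularityRecommenderSketch, the in-memory row format, the choice to rank songs by their own popularity, and the use of unrounded averages are all assumptions made for illustration. Splitting joint artist credits on " & " is inferred from the expected dictionary in the test.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (song_name, artists, popularity) triples; hypothetical sample format.
SongRow = Tuple[str, str, int]


class PopularityRecommenderSketch:
    """Hypothetical stand-in for a popularity-based recommender."""

    def __init__(self, rows: List[SongRow]) -> None:
        self.rows = rows

    def calculate_artist_avg_popularity(self) -> Dict[str, float]:
        """Average popularity per artist, splitting joint credits on ' & '."""
        scores = defaultdict(list)
        for _name, artists, popularity in self.rows:
            for artist in artists.split(" & "):
                scores[artist].append(popularity)
        return {artist: sum(v) / len(v) for artist, v in scores.items()}

    def recommend_songs(self, n: int) -> List[str]:
        """Names of the n most popular songs, highest popularity first."""
        ranked = sorted(self.rows, key=lambda row: row[2], reverse=True)
        return [name for name, _artists, _popularity in ranked[:n]]

    def __str__(self) -> str:
        artists = self.calculate_artist_avg_popularity()
        return (
            f"{type(self).__name__} with {len(self.rows)} songs "
            f"and {len(artists)} artists."
        )


rec = PopularityRecommenderSketch(
    [
        ("Song A", "Artist X & Artist Y", 4),
        ("Song B", "Artist X", 6),
        ("Song C", "Artist Z", 9),
    ]
)
print(rec.calculate_artist_avg_popularity())
print(rec.recommend_songs(2))  # ['Song C', 'Song B']
print(rec)
```

Because sorted is stable, songs with equal popularity keep their original file order in this sketch; the actual system may break ties differently.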

Future Work

As we build out this song recommendation system and make it more comprehensive, we plan to add two further ways of recommending songs: by artist, and by the distinguishing musical characteristics of the song.

To build on our existing recommendation system, we will incorporate the following steps into this project:

Building out a song similarity recommender based on artist
Testing all components
Combining into a single song recommendation system

Below is an illustrative flow chart depicting how our future work will integrate into our existing system:
