In this project we gather statistics from LastFM and Spotify to analyse and visualise the most influential Heavy Metal albums in history.
This project aims to demonstrate good software engineering practices (versioning, testing, refactoring).
It is possible to install and run our code both with and without a conda environment. See both options below.
To use with conda, simply run the following commands. An environment will be created with the name spotify, so make sure that an environment with that name does not already exist.
git clone https://github.com/ostromann/heavy_metal_history.git
cd heavy_metal_history
conda env create -f environment.yml
conda activate metalhistory
If you want to install the repository without a Conda-based environment, clone the repository, then run the following commands:
cd heavy_metal_history
pip3 install -r requirements.txt
If you want to create a virtual environment before installing the required packages, first run:
python3 -m venv /path/to/new/virtual/environment
The code in this repository can be divided into two groups of functions: data retrieval/pre-processing and visualization. We provide two Jupyter notebooks to demonstrate the usage of these functions.
To get familiar with the data retrieval/pre-processing, run:
jupyter notebook 0-preprocessing.ipynb
This uses the functions in metalhistory/data_query_functions.py to create a CSV file of pre-processed data. We do not recommend running this for a large number of album entries as it takes a long time. We have instead included the pre-processed CSV file in the repository.
To see how to do some visualizations, run:
jupyter notebook 1-visualizations.ipynb
This notebook uses the functions in metalhistory/visualization_api.py to visualize the pre-processed data in different ways.
It is also possible to view these notebooks in a browser by navigating to e.g. 1-visualizations.ipynb.
Run test routines with:
pytest -s
from the root directory. This command will execute all the test routines contained in the tests folder. The -s option prints the output of any print() statement to the screen. To run individual test routines, execute:
pytest -s metalhistory/tests/test_query_api.py
to test the data query functions, and
pytest -s metalhistory/tests/test_visualization_api.py
to test the visualization functions.
We use the following folder structure in this project:
heavy_metal_history, root of repository (includes notebooks)
- data, raw and pre-processed data
- images, generated visualizations and other images
- metalhistory, source code
- metalhistory/tests
In general we follow the Conventional Commits 1.0.0 specification. We use the following commit types: feat, fix, docs, test.
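As a rough illustration of this convention, the sketch below checks that the first line of a commit message uses one of the four types listed above. The function name `is_valid_commit_header` and the regular expression are assumptions made for this example and are not part of the repository. The pattern covers only the basic `type(scope): description` header form.

```python
import re

# Commit types used in this project (taken from the list above).
ALLOWED_TYPES = ("feat", "fix", "docs", "test")

# Hypothetical pattern: type, optional "(scope)", optional "!", then ": description".
HEADER_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^()\s]+)\))?!?: (?P<description>\S.*)$"
)


def is_valid_commit_header(header: str) -> bool:
    """Return True if a commit header matches the pattern and uses an allowed type."""
    match = HEADER_PATTERN.match(header)
    return match is not None and match.group("type") in ALLOWED_TYPES
```

Under these assumptions, `feat: add word cloud visualization` and `fix(query): handle missing playcount` would pass, while a header with an unlisted type or without the space after the colon would not.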
We draw some inspiration from Gitflow and use two permanent branches, master and stable. For each new feature we create a temporary branch off master named type/scope, where type is one of feature, fix, doc or similar, and scope is a brief name for the feature. Branch names are written in lowercase, with words separated by hyphens (-). An example branch name is feature/word-cloud-visualization.
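The naming rule for temporary branches can be expressed as a small check. This is only a sketch: the helper `is_valid_branch_name` and its pattern are hypothetical and not part of the repository. Because the list of types is open-ended ("or similar"), the pattern accepts any lowercase word as the type.

```python
import re

# Hypothetical pattern: a lowercase type, a slash, then lowercase words joined by hyphens.
BRANCH_PATTERN = re.compile(r"^[a-z]+/[a-z0-9]+(-[a-z0-9]+)*$")


def is_valid_branch_name(name: str) -> bool:
    """Return True if a temporary branch name follows the type/scope convention."""
    return BRANCH_PATTERN.fullmatch(name) is not None
```

The permanent branches master and stable are exempt from this rule, so a check like this would apply only to feature branches.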
When enough features have been implemented for a release, we merge the master branch into the stable branch and increment the release version.
Keep track of this project's development on this Trello board.
Currently we have two lists of input data:
./data/artists_unfiltered.csv which contains a list of artists that have released at least 1 album that could be tagged as a subgenre of metal (See What counts as Heavy Metal?). This means that album tags should be checked before including all albums of an artist.
./data/MA_10k_albums.csv which contains a list of the 10,000 albums, with their respective artists, that received the highest Metascores on Encyclopaedia Metallum: The Metal Archives.
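The tag check mentioned for artists_unfiltered.csv could look roughly like the sketch below. It assumes each album is a dict with hypothetical "title" and "tags" keys, and it uses only a small illustrative subset of the subgenre list from this README. None of these names come from the repository.

```python
# Small illustrative subset of the subgenre list in this README.
METAL_SUBGENRES = {"black metal", "death metal", "doom metal", "thrash metal"}


def is_metal_album(tags, subgenres=METAL_SUBGENRES):
    """Return True if at least one tag matches a known subgenre (case-insensitive)."""
    normalized = {tag.strip().lower() for tag in tags}
    return not normalized.isdisjoint(subgenres)


def filter_metal_albums(albums):
    """Keep only albums whose tags include a metal subgenre.

    `albums` is assumed to be an iterable of dicts with "title" and "tags" keys.
    Albums without a "tags" entry are treated as untagged and dropped.
    """
    return [album for album in albums if is_metal_album(album.get("tags", []))]
```

Filtering per album rather than per artist avoids pulling in non-metal releases by artists who have only one metal-tagged album.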
Additionally, we have one preprocessed dataset that is ready for data analysis and visualizations:
./data/proc_MA_1k_albums.csv contains the first 1,000 albums of ./data/MA_10k_albums.csv with added information such as listeners, playcounts, tags, URLs and images.
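As a minimal sketch of working with the preprocessed data outside the notebooks, the snippet below ranks rows by a numeric column using only the standard library. The column names `album` and `listeners` and the sample values are assumptions made for illustration. Check the header of the real CSV file before relying on them.

```python
import csv
import io


def top_albums(csv_file, column, n=10):
    """Return the n rows with the largest numeric value in `column`."""
    rows = list(csv.DictReader(csv_file))
    # Rows with a missing or non-numeric value are skipped rather than guessed.
    numeric_rows = [row for row in rows if (row.get(column) or "").strip().isdigit()]
    return sorted(numeric_rows, key=lambda row: int(row[column]), reverse=True)[:n]


# In-memory stand-in for the real file; column names and values are made up.
sample = io.StringIO("album,listeners\nAlbum A,1200\nAlbum B,\nAlbum C,5400\n")
for row in top_albums(sample, "listeners", n=2):
    print(row["album"], row["listeners"])
```

To use the real data, pass a file object opened with `open("./data/proc_MA_1k_albums.csv", newline="")` instead of the in-memory sample.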
The data will be collected using Spotify's Web API and LastFM's Web API.
The following features and limitations were already identified in the two APIs:
Spotify's Web API:
- doesn't show playcounts
- release years are often wrong (due to re-masters, special editions etc.)
- gives only a popularity score measured against the most popular artists in general
LastFM's API:
- shows playcounts and number of listeners
- many different versions of the same album appear
- release years are often wrong, too
To get the right release years, we may need to use another API (Wikipedia?) or a different approach, such as taking the earliest release year reported for any version of an album on LastFM.
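The "earliest reported year" idea can be sketched as follows. The function name, the input format (a list of years collected from the different versions of one album) and the `min_year` cutoff are all assumptions for illustration. The cutoff is there to ignore placeholder values such as 0.

```python
def earliest_release_year(years, min_year=1900):
    """Return the lowest plausible year among the reported ones, or None.

    `years` may contain None or placeholder values for versions without a
    usable release date; those entries are ignored.
    """
    plausible = [y for y in years if isinstance(y, int) and y >= min_year]
    return min(plausible) if plausible else None
```

This heuristic corrects years that are too late because of re-masters or special editions, but it cannot help if every listed version carries a wrong date.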
Right now you will need a personal API key for both Spotify's Web API and LastFM's Web API. Both are free but require registration (see Spotify Authentication and LastFM Authentication).
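One common way to keep personal keys out of the repository is to read them from environment variables. The sketch below assumes that approach. The variable names are hypothetical and not defined by this project, and the exact set of credentials each API requires should be taken from its authentication documentation.

```python
import os

# Hypothetical environment variable names; adjust to your own setup.
REQUIRED_KEYS = ("SPOTIFY_CLIENT_ID", "SPOTIFY_CLIENT_SECRET", "LASTFM_API_KEY")


def load_api_keys(environ=os.environ):
    """Return the required credentials as a dict, or raise if any is missing."""
    missing = [name for name in REQUIRED_KEYS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing API credentials: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_KEYS}
```

Failing early with the names of the missing variables makes a setup problem visible before any API request is sent.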
To give a broad overview of the genre, all of the following subgenres of Heavy Metal are considered. The list is taken from Wikipedia, extended with entries from some of its sub-pages, and can still be extended:
Alternative metal
- Funk metal
- Nu metal
- Rap metal
Avant-garde metal
Black metal
- Ambient black metal
- Blackened heavy metal
- Blackened screamo
- Blackgaze
- Black'n'Roll
- Depressive suicidal black metal
- NSBM
- Post-black metal
- Red and Anarchist black metal
- Symphonic black metal
- Viking metal
- War metal
Christian metal
- Unblack metal
Crust punk
- Blackened crust
Death metal
- Blackened death metal
- Death 'n' roll
- Melodic death metal
- Technical death metal
- Symphonic death metal
Doom metal
- Death-doom
- Drone metal
- Funeral doom
- Sludge metal
- Stoner metal
Extreme metal
Folk metal
- Celtic metal
- Pirate metal
- Pagan metal
Glam metal
- Hair metal
- Pop metal
Gothic metal
Grindcore
- Deathgrind
- Goregrind
- Pornogrind
- Electrogrind
Grunge
Industrial metal
- Industrial death metal
- Industrial black metal
Kawaii metal
Latin metal
Metalcore
- Melodic metal
- Deathcore
- Mathcore
- Electronicore
- Synthcore
- Trancecore
- Nu metalcore
- Nu metal revival
- New nu metal
- Progressive metalcore
- Technical metalcore
- Ambient metalcore
Neoclassical metal / Shred metal
Neue Deutsche Härte
Post-metal
Power metal
Progressive metal
- Djent
- Space metal
Speed metal
Symphonic metal
Thrash metal
- Crossover thrash
- Groove metal
- Teutonic thrash metal
Traditional heavy metal
- New wave of British heavy metal (NWOBHM)
- New wave of American heavy metal (NWOAHM)
- New wave of traditional heavy metal (NWOTHM)