A simple Spotify scrobbler. Gets your listening history from Spotify, saves it to a database and creates a weekly backup in Google Drive.
- Free of charge* - uses the AWS free tier and the Google Drive API, which is free to use
- Utilities to get refresh tokens from both Google and Spotify
- Easily customizable
- Listening history export to Google Drive
Spotify's API only exposes the last 50 songs you've listened to.
This project seeks to provide an easy and free solution to saving your Spotify listening history in an accessible place (Google Drive) where you can retrieve and analyze it quickly.
Other than that, you can of course use everything here as a starting point/guideline to create something else with Spotify, AWS, Google Drive and Serverless.
- This project makes use of two AWS Lambda functions: one for getting your history from Spotify and one for creating backups in Google Drive.
- Unlike Last.fm, Spotify apparently counts a song as listened to when you listen to it for "over 30 seconds". The exact behaviour of how Spotify counts a song as listened to is not clear to me, but 30 seconds seems to be the minimum.
- By default, the history Lambda (scrobbler) fetches your history from Spotify at an hourly interval. With this interval, most "regular" users who listen through songs will have their full listening history captured: even at a very low average song duration of ~2 minutes, you could listen to at most 30 songs per hour, and since Spotify keeps track of the last 50 songs you've listened to, an hourly run covers the entire hour. However, you may change the schedule.
- By default, the backup Lambda runs weekly at the start of the week (Monday at 12:30 a.m.). A week is defined according to the ISO 8601 standard and thus starts on Monday.
- By default, items in the database expire after 1 month, since they have already been backed up and are no longer needed.
- You might want to adjust the region in `serverless.yml` (`provider.region`) if you don't live near Frankfurt (the default is `eu-central-1`). Available regions.
- You can customize the backup, schedules, item expiration and much more. Customization guide.
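As a rough sketch, the region and both schedules live in `serverless.yml`. The function and event keys below are assumptions based on standard Serverless Framework syntax, so check the actual file for the exact structure:

```yaml
# Sketch only - the actual serverless.yml may structure this differently
provider:
  region: eu-central-1 # pick a region near you

functions:
  spotify-history:
    events:
      - schedule: rate(1 hour) # hourly scrobble
  spotify-history-backup:
    events:
      - schedule: cron(30 0 ? * MON *) # Monday, 12:30 a.m.
```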
- An AWS account
- A Spotify account
- `serverless` >= 3
- `node` >= v14.17.4
- Docker (optional)
- Fork and/or clone this repository and install the dependencies:

```shell
git clone git@github.com:eegli/spotify-history.git
cd spotify-history
yarn
```
- Spotify setup: Create a Spotify application (app status "development" is fine) and set the redirect URL to `http://localhost:3000`.
- In the root directory, create a folder named `.secrets` (notice the dot!).
- Create a file named `credentials_spotify.json` and copy the template below. Insert your client id and client secret. Your Spotify secrets file should look like this:

```json
{
  "clientId": "<your-client-id>",
  "clientSecret": "<your-client-secret>"
}
```
- Google Drive setup: Follow the quickstart guide to create a Google Cloud project and enable the Drive API. When asked to configure the consent screen, your publishing status should be "testing". You will need to manually add the Google account whose Drive you want to use under "Test users". In the end, you should be prompted to download the OAuth client credentials for your newly created desktop client as a JSON file.
- Download the credentials file, rename it to `credentials_google.json` and put it in the `.secrets` folder. It should look like this:

```json
{
  "installed": {
    "client_id": "blablabla",
    "project_id": "spotify-history-32as4",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_secret": "blablabla",
    "redirect_uris": ["urn:ietf:wg:oauth:2.0:oob", "http://localhost"]
  }
}
```
Almost done!
- Run the following command and follow the steps. This will create a `token_spotify.json` file in the `.secrets` folder containing your long-lived Spotify refresh token. KEEP THIS FILE SECURE!

```shell
npm run token:spotify
```
- Run the following command and follow the steps. This will create a `token_google.json` file in the `.secrets` folder containing your long-lived Google Drive refresh token. KEEP THIS FILE SECURE!

```shell
npm run token:google
```
- Done!
This project includes both a staging and production environment. By default, the schedules are only enabled in production in order to save quota. The staging version is meant to be deployed but invoked manually only.
If you wish to enable the schedules on staging as well, change `serverless.yml`:

```yaml
custom:
  scheduleEnabled:
    prod: true
    stg: true # Schedule enabled on staging
```
Keep in mind that this will double the calls made to Lambda and DynamoDB!
In order to deploy the production version, run:
```shell
npm run prod:deploy
```
You can deploy the staging version as well:
```shell
# Deploy everything
npm run stg:deploy

# Deploy functions only
npm run stg:deploy:history
npm run stg:deploy:backup
```
Again, the staging functions are NOT scheduled by default as they are meant to be invoked manually:
```shell
# Get history from Spotify and save to DynamoDB
npm run stg:invoke:history

# Create backup in Google Drive
npm run stg:invoke:backup
```
To check the logs of your Lambda functions, either go to the AWS CloudWatch dashboard or retrieve them in your console.
Example: Getting the logs for production in the last 24h
```shell
sls logs -f spotify-history -s prod --startTime 1d
sls logs -f spotify-history-backup -s prod --startTime 1d
```
More info about logging with Serverless.
By default, these are the song properties that are saved to the database (and backup):
```typescript
interface DynamoHistoryElement {
  name: string;
  id: string;
  playedAt: string;
}
```
If you want to save other song properties, simply change this interface in `src/config/types.ts` and TypeScript will show you where you'll need to make adjustments. Obviously, it makes sense to at least store the timestamp of when the song was played (`playedAt`) and its id (`id`).
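For example, a hypothetical extension that also stores the artist names might look like this (the `artists` field and the sample values are illustrations, not part of the project's actual schema):

```typescript
// Hypothetical extension of the stored song properties.
// The `artists` field is an illustration, not the project's schema.
interface DynamoHistoryElement {
  name: string;
  id: string;
  playedAt: string;
  artists: string[]; // newly added property
}

// Example item as it would be written to DynamoDB
const item: DynamoHistoryElement = {
  name: "Bohemian Rhapsody",
  id: "4u7EnebtmKWzUH433cf5Qv",
  playedAt: "2021-08-01T12:00:00.000Z",
  artists: ["Queen"],
};

console.log(item.artists.join(", "));
```

After adding a field here, the compiler errors will point you to the mapping code that needs to populate it.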
By default, items in DynamoDB are set to expire after 1 month. If you wish to disable this, set the TTL specification in `serverless.yml` to `false` (or remove the implementation altogether for a cleaner codebase):
```yaml
TimeToLiveSpecification:
  AttributeName: 'expire_at'
  Enabled: false
```
If you want to specify a different TTL, change the `dynamoExpireAfter` default in `src/config/defaults.ts`.
If you want to change the backup schedule, e.g. running it daily or monthly, you'll need to adjust the cron expression in `serverless.yml`. Here are some resources regarding cron jobs.
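For example, a daily backup at 12:30 a.m. could look like this (AWS schedule expressions use a six-field cron syntax: minutes, hours, day-of-month, month, day-of-week, year; the surrounding keys are a sketch of the standard Serverless event syntax):

```yaml
# Hypothetical daily schedule for the backup function
events:
  - schedule: cron(30 0 * * ? *) # every day at 00:30
```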
If you change the backup schedule, you'll also need to change the time range of the backup and, most likely, the item TTL as shown above.
```typescript
// Example: Include history from last month
const defaults: Readonly<Defaults> = {
  dynamoExpireAfter: [2, 'months'], // Extend the expiration date
  backupRange: [1, 'month'], // Extend the backup range
  ...
};
```
Update the stage and production folder names in `src/config/defaults.ts`.
Note that, for security reasons, the backup handler only has access to folders and files it has created itself (see OAuth 2.0 scopes and `scripts/google.ts`). For simplicity, the backup folder is created at the root of your Google Drive.
For local development and for testing the database integration, AWS's official DynamoDB Docker image can be run alongside another image that provides a nice GUI for inspecting the tables and items.
- Start the containers (DynamoDB and GUI):

```shell
npm run dynamo:start
```

- Migrate the table and seed it:

```shell
npm run dynamo:migrate
```
- If you want to check whether everything has been set up correctly, visit http://localhost:8001/
- Invoke locally:

Note that `npm run local:backup` will, despite its name, still hit the Google Drive API, but it saves the content in a folder (`local`) separate from `stg` and `prod`.

```shell
# Gets the history and saves it to the local DynamoDB
npm run local:history

# Backs up the history to Google Drive
npm run local:backup
```
The core of this project uses AWS DynamoDB Data Mapper. Unfortunately, this package does not seem to be actively maintained and is only compatible with the AWS SDK v2. The AWS SDK v2 is included in the Lambda runtime environment by default, but the modular version 3 is not. For these reasons, it is currently not possible to upgrade this project to the modular AWS SDK.
- Mocking TS method overloads with Jest
- Amazon DynamoDB DataMapper For JavaScript
- Amazon DynamoDB DataMapper Annotations
- Using the DynamoDB Document Client
- Serverless DynamoDB Local
- TypeScript: adjusting types in reduce function with an async callback
* Serverless uses S3 to store the code of the deployed functions. Technically, S3 is not free. It costs a fraction of a dollar per GB, but a deployment takes up so little space that you most likely won't be billed. A full month of testing "cost" me $0.01, and I was not billed. Be aware that, if you change the schedules, this project may not be "free" anymore!