A remote uptime monitoring framework for running monitors as a CRON job.
At Dollar Shave Club, we run some of our monitors using CircleCI 2 Scheduled Workflows. You can see the test/example monitors for this repository running every minute here: https://circleci.com/gh/dollarshaveclub/workflows/monitor/tree/master. See our CircleCI 2 Config.
With this monitoring solution, we were able to:
- Run our monitors every minute instead of every 5 minutes
- Test both our API and Browser monitoring scripts outside of a monitoring platform's console/UI. We were unable to do this with our Terraform setup.
- Use these monitoring scripts as tests for our version of Heroku Review Apps
- Develop monitors alongside our features and merge them at the same time. We no longer have conflicts between our codebase and our monitors.
- Easily create and manage hundreds of monitors, which is difficult with Terraform (excessive copy pasta) and any UI-based monitoring platform.
Some downsides to our CircleCI Scheduled Workflow setup are:
- Contention with your other tests. If you reach your CircleCI 2 container limit, your monitors will queue then run in bursts, losing coverage momentarily.
- May not be as fast as running monitors as Kubernetes jobs because CircleCI runs many commands like `npm install` on every build, which could be slower than just pulling a Docker container. However, having a CircleCI UI is preferable.
What about features other monitoring solutions provide?
- We must set up our own dashboards and alerting
- We still use other services for features we need, just not for monitoring everything
- We don't need to run these monitors from multiple locations. If we do, we'll run them as Kubernetes jobs on different clusters.
There are two ways to run these monitors.
To run monitors locally:
npx dsc-monitor 'monitors/**/*.js'
Run `dsc-monitor --help` for options.
NOTE: this assumes you've installed this library as a local dependency, which is installed as `dsc-monitor`.
If you're running the monitors from this repository, use `./bin/run.js`.
If you've run `npm install --global @dollarshaveclub/monitor`, just run `dsc-monitor`.
Copy our Dockerfile Template to your repository, then run:
docker build -t dsc-monitor .
docker run -t dsc-monitor 'monitors/**/*.js'
mkdir my-monitors # your repository name
cd my-monitors
npm init
npm i --save @dollarshaveclub/monitor
mkdir monitors
- Create a test monitor. You can use one of our example monitors.
- Add the `npm run monitors` command:
  - Add the following `script` to your `package.json`: `"monitors": "dsc-monitor 'monitors/**/*.js'"`
- Run your monitors with `npm run monitors`
- Set up your monitors in CircleCI as a CRON job:
  - Copy .circleci/template.config.yml to `.circleci/config.yml` and push
Monitor environment variables:
- `MONITOR_CONCURRENCY=1` - number of monitors running at the same time
  - When `concurrency === 1`, results will stream to `stdout`
  - When `concurrency > 1`, results will be logged one monitor set at a time
- `MONITOR_SHUFFLE` - whether to shuffle monitors and monitor sets
- `MONITOR_SHUFFLE_MONITOR_SETS` - whether to shuffle monitor sets
- `MONITOR_SHUFFLE_MONITORS` - whether to shuffle monitors within a set
All monitoring sets are defined in `monitors/`.
Each set is a module with:
- `exports.disabled<Boolean> = false` - whether this monitor is disabled
- `exports.id<String> = __filename [optional]` - an ID for your monitor set, defaulting to the filename
- `exports.slowThreshold<Number|String> = 30s [optional]` - slow threshold for the entire monitor set
- `exports.parallelism<Number> = 1 [optional]` - split this monitor set into shards and run in parallel
- `exports.monitors<Array>` - an array of monitors with the following properties:
  - `id<String> [required]` - the ID of the monitor
  - `parameters<Object> [optional]` - parameters to send to the monitor function and for data purposes
  - `monitor<Function>(monitorConfig, monitorSetConfig, { attempt, log }) [required]` - the monitor function, which is passed this monitor object as well as `exports`
    - `monitorConfig` - this `monitor` object
    - `monitorSetConfig` - this `exports` object
    - `attempt = 0` - the attempt # for this monitor
    - `log(str)` - a function to log in a nicely-formatted way
  - `timeout<Number|String> = '5s' [optional]` - timeout for the monitor before it's considered a failure
  - `slowThreshold<Number|String> = '1s' [optional]` - slow threshold for a monitor
  - `retries<Number> = 0 [optional]` - number of times to retry a failing monitor
- Optional functions to run within the life cycle of the monitoring set:
  - `exports.beforeAll<Function>(monitorSetConfig)`
  - `exports.afterAll<Function>(monitorSetConfig, result)`
  - `exports.beforeEach<Function>(monitorConfig, monitorSetConfig, { attempt, log })`
  - `exports.afterEach<Function>(monitorConfig, monitorSetConfig, { attempt, log })`
What certain fields do:
- `slowThreshold` - turns the color of the time from `green` to `yellow` when a monitor or set of monitors takes at least this amount of time
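To make the `timeout`, `retries`, and `slowThreshold` fields more concrete, the sketch below shows one way such options could interact. It is not this library's implementation: `withTimeout`, `runWithRetries`, and the millisecond option names are hypothetical, and it assumes attempts are numbered from 0 and that `retries` counts extra attempts after the first.

```javascript
// Illustrative only: NOT the library's implementation.

// Hypothetical helper: reject if `promise` does not settle within `ms` milliseconds.
function withTimeout (promise, ms) {
  let timer
  const timeout = new Promise((resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms)
  })
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer))
}

// Hypothetical runner: one initial attempt plus up to `retries` retries.
async function runWithRetries (monitorFn, { timeoutMs = 5000, slowThresholdMs = 1000, retries = 0 } = {}) {
  let lastError
  for (let attempt = 0; attempt <= retries; attempt++) {
    const start = Date.now()
    try {
      await withTimeout(Promise.resolve().then(() => monitorFn({ attempt })), timeoutMs)
      const elapsedTime = Date.now() - start
      // A successful but slow run is still a success; it is only flagged.
      return { success: true, attempts: attempt + 1, elapsedTime, slow: elapsedTime >= slowThresholdMs }
    } catch (err) {
      lastError = err
    }
  }
  return { success: false, attempts: retries + 1, error: lastError }
}
```

Under these assumptions, a monitor with `retries: 1` that fails once and then passes is reported as a success after two attempts.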
Create a file named `dsc-monitor.js` with the form:
module.exports = (monitorRunner) => {
}
Then pass it as a plugin (`-p`) when you run the monitors:
dsc-monitor -p dsc-monitor.js 'monitors/**/*.js'
Hook into events via `monitorRunner.events.on(<event>, callback)`. The events are:
- `monitorSet` => `(result) => {}` - when a monitor set is completed
  - `monitorSetConfig`
  - `results` - array of `monitor` results
  - `success = true|false`
  - `elapsedTime` - in milliseconds
- `monitor` => `(result) => {}` - when a monitor is completed
  - `monitorSetConfig`
  - `monitorConfig`
  - `results` - array of `monitorAttempt` results
  - `success = true|false`
  - `elapsedTime` - in milliseconds
- `monitorAttempt` => `(result) => {}` - when a monitor attempt is completed
  - `monitorSetConfig`
  - `monitorConfig`
  - `success = true|false`
  - `elapsedTime` - in milliseconds
  - `error` - if an error occurred
  - `attempt = 1` - attempt #
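A plugin built on these events might look like the following sketch, which records failed attempts and prints a summary per monitor set. The `failures` array attached to the plugin is a hypothetical addition for illustration, and the sketch assumes `monitorRunner.events` behaves like a Node.js `EventEmitter`.

```javascript
// dsc-monitor.js
// Hypothetical: keep failed attempts in memory so they can be inspected or forwarded.
const failures = []

const plugin = (monitorRunner) => {
  monitorRunner.events.on('monitorAttempt', (result) => {
    if (result.success) return
    failures.push({
      id: result.monitorConfig.id,
      attempt: result.attempt,
      elapsedTime: result.elapsedTime,
      message: result.error ? result.error.message : 'unknown error'
    })
  })

  monitorRunner.events.on('monitorSet', (result) => {
    // `results` holds one entry per monitor in the set.
    const failed = result.results.filter((r) => !r.success).length
    console.log(`${result.monitorSetConfig.id}: ${failed} of ${result.results.length} monitors failed in ${result.elapsedTime}ms`)
  })
}

plugin.failures = failures
module.exports = plugin
```

In a real setup, the `monitorAttempt` handler is where results could be forwarded to your own alerting or metrics service instead of being kept in memory.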
See CircleCI 2 workflow scheduling: https://circleci.com/docs/2.0/workflows/#scheduling-a-workflow. You can work off our .circleci/config.yml template.
See all builds on master of workflow `monitor` without a commit attached to it: https://circleci.com/gh/dollarshaveclub/monitor/tree/master
Or just look at the `monitor` workflow: https://circleci.com/gh/dollarshaveclub/workflows/monitor/tree/master