Your task is to implement a program that monitors the availability of many websites over the network, produces metrics about them, and stores the metrics in a PostgreSQL database.
The website monitor should perform the checks periodically and collect the request timestamp, the response time, and the HTTP status code, as well as optionally check the returned page contents for a regex pattern that is expected to be found on the page. Each URL should be checked periodically, with the check interval (between 5 and 300 seconds) and the regexp configurable on a per-URL basis. The monitored URLs can be anything found online.
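A single check collects exactly those fields. A minimal sketch using only the standard library (`urllib`); the `CheckResult` fields and function names here are illustrative, not a required interface:

```python
import re
import time
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class CheckResult:
    url: str
    checked_at: float              # request timestamp (Unix time)
    response_time: float           # seconds
    status_code: int
    regex_matched: Optional[bool]  # None when no pattern is configured


def evaluate_body(body: str, pattern: Optional[str]) -> Optional[bool]:
    """Report whether the expected pattern occurs in the page, or None if unset."""
    if pattern is None:
        return None
    return re.search(pattern, body) is not None


def check_url(url: str, pattern: Optional[str] = None,
              timeout: float = 10.0) -> CheckResult:
    """Fetch the URL once and record the metrics described above."""
    started = time.time()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
        status = resp.status
    return CheckResult(url, started, time.time() - started, status,
                       evaluate_body(body, pattern))
```

Splitting the regex evaluation out of the network call keeps that part trivially unit-testable.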
The solution should NOT include using any of the following:
- Database ORM libraries - use a Python DB API or similar library and raw SQL queries instead.
- External scheduling libraries - we really want to see your take on concurrency.
- Extensive container build recipes - rather focus your effort on the Python code, tests, documentation, etc.
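Without an external scheduling library, one possible take on the concurrency requirement is a long-lived `asyncio` task per monitored URL. A hedged sketch; the `check` callable and the stop event are assumptions for illustration, not part of the task spec:

```python
import asyncio
import random


async def monitor_url(url, interval, check, stop):
    # Stagger start-up so many URLs don't all fire at the same instant.
    await asyncio.sleep(random.uniform(0, interval))
    while not stop.is_set():
        loop = asyncio.get_running_loop()
        started = loop.time()
        try:
            await check(url)
        except Exception as exc:
            # A failing check must not kill the monitoring loop.
            print(f"check failed for {url}: {exc}")
        # Sleep only the remainder, so the period stays close to `interval`
        # even when the check itself takes time.
        await asyncio.sleep(max(0.0, interval - (loop.time() - started)))


async def run_monitors(targets, check, stop):
    # One task per (url, interval) pair; gather keeps them all running.
    await asyncio.gather(
        *(monitor_url(url, interval, check, stop) for url, interval in targets)
    )
```

The same shape works with threads instead of `asyncio` if the checks use a blocking HTTP client.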
To install the tool:
- Clone the repo.
- Create a virtual environment of your choice, e.g. with `venv`:
  `python3.11 -m venv .venv`
- Activate the virtual environment, e.g. for `venv` on Linux: `source .venv/bin/activate`
- Install the package: `pip install .`
To test with a local PostgreSQL, the attached Docker Compose file can be used as follows:
docker compose --env-file .test.env up
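With the database up, the no-ORM constraint means metrics are written with raw SQL through a DB API driver. A sketch assuming `psycopg2` and a hypothetical `website_checks` table (both names are illustrative, not part of the attached setup):

```python
# Hypothetical table and column names; adjust to your actual schema.
INSERT_SQL = """
    INSERT INTO website_checks
        (url, checked_at, response_time, status_code, regex_matched)
    VALUES (%s, %s, %s, %s, %s)
"""


def insert_params(result: dict) -> tuple:
    """Map one check result onto the parameter tuple for INSERT_SQL."""
    return (
        result["url"],
        result["checked_at"],
        result["response_time"],
        result["status_code"],
        result.get("regex_matched"),  # None when no pattern was configured
    )


# With an open psycopg2 connection (assumed available as `conn`):
# with conn, conn.cursor() as cur:
#     cur.execute(INSERT_SQL, insert_params(result))
```

Passing values as query parameters (never via string formatting) keeps the raw SQL safe from injection.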
Then to use it:
monitor --settings example_settings.yaml --envfile .test.env --verbose
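The `--settings` file carries the per-URL configuration described above. Its actual schema is project-specific; a purely hypothetical `example_settings.yaml` layout covering the spec (per-URL interval between 5 and 300 seconds, optional regexp) might look like:

```yaml
# Hypothetical layout; field names are illustrative, not the tool's actual schema.
websites:
  - url: https://example.com
    interval_seconds: 30          # allowed range: 5-300
    regexp: "Example Domain"      # optional pattern expected on the page
  - url: https://example.org
    interval_seconds: 120         # no regexp: only timing and status are recorded
```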
Thanks to `argparse`, built-in help is available as well:
monitor --help
usage: monitor [-h] --settings SETTINGS [--verbose] [--envfile ENVFILE]
Util to monitor and log state of websites
options:
-h, --help show this help message and exit
--settings SETTINGS Path to settings.yaml file
--verbose Set logging level to DEBUG
--envfile ENVFILE Path to environment file with credentials of postgres
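The help output above maps onto an `argparse` parser along these lines; this is a sketch reconstructed from the help text, not the tool's actual source:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Flags and help strings mirror the `monitor --help` output shown above.
    parser = argparse.ArgumentParser(
        prog="monitor",
        description="Util to monitor and log state of websites",
    )
    parser.add_argument("--settings", required=True,
                        help="Path to settings.yaml file")
    parser.add_argument("--verbose", action="store_true",
                        help="Set logging level to DEBUG")
    parser.add_argument("--envfile",
                        help="Path to environment file with credentials of postgres")
    return parser
```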