The Dockerfile in this repo builds the base Docker image for running scrapers on morph.io.
It is very similar to the Cedar platform on Heroku, with a script that installs a few extra libraries that we want to use.
So if you need an extra system library installed on morph.io for scrapers to use, this is most likely the repo you will need to modify.
After updating this repo:
- Push to GitHub. This will trigger an automatic build on GitHub Actions (see .github/workflows), which will push the final image to Docker Hub
- Wait until the build is complete (see https://github.com/openaustralia/buildstep/actions)
- Either deploy morph.io to force the latest images to be downloaded, or ssh to morph.io and run `docker pull openaustralia/buildstep`
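
For the manual route, the commands run on the server could be wrapped up as below. This is only a minimal sketch: the function name `refresh_buildstep_image` is made up for illustration and is not part of this repo, and it assumes you are already logged in to the server as a user who is allowed to run Docker.

```shell
# Hypothetical helper, not part of this repo: run it on the morph.io server.
refresh_buildstep_image() {
    # Fetch the newest image that the GitHub Actions build pushed to Docker Hub.
    docker pull openaustralia/buildstep || return 1
    # List the local copies so you can check that the CREATED column is recent.
    docker images openaustralia/buildstep
}
```

Calling `refresh_buildstep_image` after the build has finished should leave the server with the freshly built image.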
This repository also contains the CA certificate that gets installed into morph.io containers so that the transparent mitmproxy works. It expires every few years and needs to be updated. To do this:
- Before you start you probably want to disable mitmproxy on the server. Run `iptables-morph-remove` on the server to do this
- Install and run `mitmproxy` on your machine. This will create a set of certificates in `~/.mitmproxy`
- Check the expiry on `mitmproxy-ca-cert.pem` in that directory (use `openssl x509 -in mitmproxy-ca-cert.pem -text -noout`). It should be a few years off
- Overwrite the `mitmproxy-ca-cert.pem` file in this repository with the one from your machine
- Carry out the steps above under "After updating this repo"
- Replace the certificates in the main morph.io repository by copying all 5 from `~/.mitmproxy` to that repository. Push your changes to GitHub and deploy morph.io
- Re-enable mitmproxy on the server by running `iptables-morph-add`
Annoyingly we can't just use a certificate that expires a long time in the future (say 10 years). See mitmproxy/mitmproxy#815
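
The expiry check in the steps above can also be scripted. The following is a minimal sketch rather than something this repo ships: `cert_valid_for_days` is a hypothetical helper name, and it relies on the `-checkend` option of `openssl x509`, which takes a number of seconds and exits non-zero if the certificate expires within that window.

```shell
# Hypothetical helper, not part of this repo.
# Usage: cert_valid_for_days CERT_FILE DAYS
# Succeeds (exit status 0) if the certificate is still valid DAYS days from now.
cert_valid_for_days() {
    cert_file=$1
    days=$2
    # -checkend expects seconds, so convert days to seconds (86400 per day).
    # openssl prints a short message either way; only the exit status matters here.
    openssl x509 -in "$cert_file" -noout -checkend "$((days * 86400))" >/dev/null
}
```

For example, `cert_valid_for_days ~/.mitmproxy/mitmproxy-ca-cert.pem 365 || echo "renew soon"` would print a reminder once the certificate has less than a year left.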