Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fail CI on missing image #685

Closed
mrjones-plip opened this issue Apr 25, 2022 · 5 comments
Closed

Fail CI on missing image #685

mrjones-plip opened this issue Apr 25, 2022 · 5 comments

Comments

@mrjones-plip
Copy link
Contributor

We recently had a missing image (404) that was live on production in #683 . We should first see if there's a built in hugo config that will fail to build the site (both locally and in CI) if an image is missing, just like when there's a bad link to a page. Baring that, we should see can revive the currently unused link checker to for broken images.

@mrjones-plip
Copy link
Contributor Author

mrjones-plip commented Apr 27, 2022

@garethbowen - so I took a look at this and in muffet, what we were using for link checking, frustratingly there's no "Just check self referential links" option or a reciprocal --include flag for the existing --exclude flag. So I hacked together this with the existing flags which is a bit crude, but it excludes common TLDs in a regex and then some false positives as well:

muffet http://localhost:1313 \
  --buffer-size 50000 \
  --timeout 35 \
  --ignore-fragments \
  --exclude ".*\.[com|org|io|it|int|net]" \
  --exclude ".*localhost.*" \
  --exclude ".*127.0.0.*"

With the gTLDs out there, this regex is sisyphean, but functional working well enought with the 6 above. I find your broken image on my out of date master in just under 3 seconds, so...works well enough?

➜  cht-docs git:(master) ✗ time ./.github/scripts/muffet.sh
http://localhost:1313/apps/tutorials/tasks-1/
        404     http://localhost:1313/apps/tutorials/tasks-1/first-task.jpg
./.github/scripts/muffet.sh  2.92s user 0.36s system 82% cpu 3.954 total

@garethbowen
Copy link
Member

Is the `--exclude ".localhost." line so it only checks relative links and not absolute? Otherwise I don't get how this works.

There's also this tool which does appear to offer an option for internal only.

Either way we should revive the full link checker as a cron job so if it consistently fails we can look into it.

@mrjones-plip
Copy link
Contributor Author

Ah - good point! I was being lazy with my regex when trying to avoid these false positives:

http://localhost:1313/apps/guides/hosting/app-developer/
        dial tcp4 127.0.0.1:8443: connect: connection refused   https://localhost:8443
http://localhost:1313/apps/reference/app-settings/outbound/
        dial tcp4 127.0.0.1:5984: connect: connection refused   http://localhost:5984/_utils/#/_config
http://localhost:1313/apps/guides/messaging/gateways/rapidpro/
        dial tcp4 127.0.0.1:5984: connect: connection refused   http://localhost:5984/_utils/#/_config

This can be fixed with this config:

muffet http://localhost:1313 \
  --buffer-size 50000 \
  --timeout 35 \
  --ignore-fragments \
  --exclude ".*\.[com|org|io|it|int|net]" \
  --exclude "http[s]*://localhost[8443|5984]*" \
  --exclude ".*127.0.0.*"

However, after looking into htmltest and finding the original CI link checker ticket which mentions it, I also found the "run nightly" ticket. Given we disabled the link checker because it found broken links unrelated to the PR where CI was run, maybe we should just turn on nightly checks? This would catch broken images AND 404s to external sites.

@garethbowen
Copy link
Member

I'm ok with the muffet excludes, or the htmltest, so go for whichever you think is appropriate.

Yes let's turn on the nightly checks - it's really easy and as long as it doesn't spam too much isn't disruptive. I suppose we can always add some exclusions if we find the same links are giving false negatives too often. We already use a nightly job for cleaning up the market: https://github.com/medic/cht-core/blob/master/.github/workflows/cleanup.yml

@mrjones-plip
Copy link
Contributor Author

Closing in favor of #366

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants