Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added healthcheck route #11181 #13117

Closed
wants to merge 1 commit into from
Closed

Conversation

lc0305
Copy link

@lc0305 lc0305 commented Jul 5, 2021

Hello.

This healthcheck route is required for using Ghost behind a cloud-service load balancer which can not manually set "X-Forwarded-Proto: https".

Issue: #11181

  • There's a clear use-case for this code change
  • Commit message has a short title & references relevant issues
  • The build will pass (run yarn test and yarn lint)

@TobiasWust
Copy link

this is really usefull

@Dynamic-Gravity
Copy link

This is exactly what we need

@lc0305
Copy link
Author

lc0305 commented Sep 23, 2021

This is exactly what we need

You can work around this by allowing 3xx HTTP status codes as succesful with the AWS application load balancer since the maintainers apparently do not care whether their product is actually useful within cloud infrastructure.

@ErisDS ErisDS self-assigned this Oct 21, 2021
@ErisDS
Copy link
Member

ErisDS commented Feb 1, 2022

I'm sorry to have neglected this PR for so long 😳

The thing that's missing for me here is reference material / precedent or a spec for what the URL should be. Why /health? Can we point to other platforms that have this? Is it the most common endpoint?

I'm happy to accept a PR of this nature but would just like to know I'm not gonna get another one right away suggesting we change it or add a second more popular route! 😬

@darksworm
Copy link

/health a common endpoint for a generic healthcheck endpoint. So is /healthcheck. Unless you need something more specific, either is fine (for example, if your app takes 3 minutes to "warm up", you might want to separate the healthcheck in to multiple endpoints /health/warmup & /health/live)

@ErisDS ErisDS closed this May 27, 2022
@ErisDS ErisDS reopened this May 27, 2022
@ErisDS
Copy link
Member

ErisDS commented May 27, 2022

Didn't mean to close this - I am looking for some specific reference material for what a healthcheck endpoint needs to do and common patterns for naming.

If it's just an endpoint that returns a 200 when the service is running, then why not ping / or /ghost/api/admin/site? I'm sure there is a reason!

Looking again, I'm also not sure how this PR mitigates the problem of not being able to set "X-Forwarded-Proto: https" 🤔

@Dynamic-Gravity
Copy link

Dynamic-Gravity commented May 27, 2022

@ErisDS The health checks purpose is to primarily let a load balancer or monitoring to determine the health about the service.

Here is a good primer. https://testfully.io/blog/api-health-check-monitoring/

If it's just an endpoint that returns a 200 when the service is running, then why not ping / or /ghost/api/admin/site? I'm sure there is a reason!

Pings are not sufficient they only tell if the service is reachable. This is why there are typically two: one that monitors if reachable and then another that determines service readiness. Since it might take a bit for Ghost to contact a remote database or spin up other components. If we have a couple instances behind a load balancer that routes traffic between multiple nodes based on service health, as is common for clustering, then us admins need this functionality in order to route with confidence to our nodes.

Edit: here is the Kubernetes documentation regarding this. Which is what I am assuming most people are complaining about when trying to use Ghost.

Ideally, there would be a liveliness and readiness endpoint.

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/

@ErisDS
Copy link
Member

ErisDS commented Jul 27, 2022

I'm going to close this for now for a few reasons:

  1. there's no clarity here in response to my questions (I was not asking what healthchecks are for)
  2. we've restructured the codebase meaning this would need a rebase
  3. The OP has (quite understandably) moved on

I'm more than happy to see a new PR that adds healthchecks to Ghost, with some description of why the proposed implementation is correct & serves the most use cases. My aim is to end up with one set of healthcheck tools that work for everyone, and not have to keep adding new routes that do a slightly different thing.

@ErisDS ErisDS closed this Jul 27, 2022
@Dynamic-Gravity
Copy link

there's no clarity here in response to my questions (I was not asking what healthchecks are for)

Was my direct response to you previously insufficient? What sort of resource were you expecting?

we've restructured the codebase meaning this would need a rebase

Fair enough, but this languished for too long.

This issue is still worth pursuing.

Thank you for your time.

@ErisDS
Copy link
Member

ErisDS commented Jul 28, 2022

I agree this languished too long. I'm more than happy to support a new PR for this with a thorough solution that addresses all the concerns both in this thread and the OG issue #11181.

I'm looking for resources that demonstrate precedent for and a solution that covers:

  • The name and location of the endpoint
  • The exact response code and body
  • Handling proxies and HTTPS correctly
  • When in Ghost's lifecycle a response should be served

This PR added a route at /ghost/api/health/ that returns a 200 status code and the text "SUCCESS". The route is mounted at the end of the boot cycle (so will return 503s whilst Ghost is booting) and also has no handling for anything to do with proxies or https despite this being mentioned in the PR description.

The aim is to merge a single endpoint that will work for everyone and is more useful than the existing endpoint that can be used for healthchecks which lives at /ghost/api/admin/site and returns a 200 and a JSON blob once Ghost has booted. I don't see that the proposed code in this PR gives us anything over and above /ghost/api/admin/site so I didn't see a reason to merge it.

@danmasta
Copy link

danmasta commented Aug 20, 2022

@ErisDS According to ghost acces logs the /ghost/api/admin/site endpoint returns a 301 status code which is incompatible with many cloud load balancers. Many cloud load balancers require a 200 status code only for health checks

Access Logs from ghost v5.10.1:

[2022-08-20 06:30:22] INFO "GET /ghost/api/admin/site" 301 1ms

Access Logs from ghost v4.48.2:

[2022-08-20 06:17:29] INFO "GET /ghost/api/admin/site" 301 1ms

You can see requests to the endpoint /ghost/api/admin/site in both versions 4.x and 5.x are returning 301 HTTP status codes

This was tested in a production kubernetes environment via gke with the following health check configuration:

livenessProbe:
    httpGet:
        path: /ghost/api/admin/site
        port: http
    initialDelaySeconds: 5
    periodSeconds: 5
    timeoutSeconds: 1
readinessProbe:
    httpGet:
        path: /ghost/api/admin/site
        port: http
    initialDelaySeconds: 5
    periodSeconds: 5
    timeoutSeconds: 1

As a workaround in v4.x we extended the api router with a custom health check endpoint at /ghost/api/admin/site/health, but that no longer works since the structure of the app has changed in v5.x. It's annoying to rebuild that functionality for each major version change.

Can you please re-evaluate adding a health check only endpoint that does not use redirects and returns only a 200 status code with an empty response body?

Copy link
Member

@danmasta you're missing a trailing slash on your URLs, the requests should be made to GET /ghost/api/admin/site/

@danmasta
Copy link

danmasta commented Aug 20, 2022

Hi @kevinansfield, I added the trailing slash and I'm still seeing a 301 status code:

[2022-08-20 23:10:39] INFO "GET /ghost/api/admin/site/" 301 1ms

When I exec into the container and test with curl it looks like it tries to redirect to the ghost url config:

curl http://localhost:2368/ghost/api/admin/site/
Moved Permanently. Redirecting to https://$GHOST_URL/ghost/api/admin/site/

When testing the same thing in a local docker container with default config it does work correctly with no redirects:

curl http://localhost:2368/ghost/api/admin/site/
{"site":{"title":"Ghost","description":"Thoughts, stories and ideas","logo":null,"icon":null,"accent_color":"#FF1A75","url":"http://localhost:2368/","version":"5.10"}}

It seems like the issue is the /ghost/api/admin/site/ api endpoint is subject to ghost redirection logic, particularly in production with the url configuration set. We really need an endpoint that exists outside that redirect logic in order to be useful for health checks

@samrap
Copy link

samrap commented Sep 21, 2022

Recommendation for anyone at the end of this thread:

Write a sidecar service (I did one in Go) that monitors the systemd status and exposes it via a separate HTTP port. You can use this to evaluate health.

So for example, my Go service just exposes an HTTP server which returns 200 if Ghost is running on the system or 5xx if not. You can throw that server on, say port 8000, and configure your LB to hit that endpoint/port for health checks.

@ErisDS
Copy link
Member

ErisDS commented Sep 28, 2022

@danmasta it looks like you're having configuration issues, which would have been better suited to the forum.

@samrap it sounds like you've found something that works for you, but without really explaining why the existing suggested workaround didn't work for you - which doesn't really help with building up a plan for what a healthcheck should look like in Ghost.

The bottom line is, I am pretty sure this PR doesn't do what anyone thinks it does - there's no one in this thread saying they've used this in production and it works as a healthcheck. I can't see that this new /ghost/api/health/ endpoint adds anything over and above what /ghost/api/admin/site/ already does.

On the flip side, I do acknowledge completely that a healthcheck route that has some functionality beyond /ghost/api/admin/site/ (e.g. doesn't enforce HTTPS) might be a valid addition to Ghost, but from this PR and thread I have zero clarify on what the actual requirements would be for such a route, nor what the standards are used in other tools.

I suggest that anyone interested in getting this over the line open a thread in the Contributing section of the Ghost forum with an outline of a spec and some reference material for why that spec is correct for collaboration - OR if you don't think it needs discussion, open a new PR with the same info and the solution.

For now, I'm going to lock this thread and again clarify that I would be very happy to merge a working, tested, useful healthcheck route into Ghost.

@TryGhost TryGhost locked as resolved and limited conversation to collaborators Sep 28, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants