Simplified worker monitoring and restart, without gunicorn. #1942
Conversation
There's a behavior change that may be unexpected for those running on k8s and expecting the pod to be restarted during a crash. To cover those, I recommend having an unmanaged option.
I would agree, but I'm not sure it matters, because this new functionality is only used when running more than one worker. Unless I'm mistaken, k8s pods run in single-worker mode (1 worker = 1 pod), because k8s is being used as the "worker manager".
break
# Restart expired workers.
for process in self.processes:
    if not process.is_alive():
This is not reliable. We can't know if the process is stuck. It will only give the false impression of a good process manager.
I've considered this, but I don't think it's reasonable to expect Uvicorn to be so involved, because determining "stuck" and handling a "stuck" worker could be quite different for every project: it's up to the developer to decide if/when/how to kill a "stuck" worker.
Nor is it the point of this PR.
We want Uvicorn to just mimic k8s here, and only become involved when the worker is dead/crashed/exited, which is what is_alive() does perfectly.
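For readers following the thread, here is a minimal, self-contained sketch (not uvicorn's actual code) of the kind of is_alive()-based restart loop being discussed; the worker target, worker count, and polling interval are placeholders:

```python
import multiprocessing
import time


def worker(worker_id: int) -> None:
    # Placeholder for the real server worker; a crash or exit here is
    # what the supervisor below reacts to.
    while True:
        time.sleep(1)


def supervise(num_workers: int = 4) -> None:
    processes = [
        multiprocessing.Process(target=worker, args=(i,), daemon=True)
        for i in range(num_workers)
    ]
    for p in processes:
        p.start()

    while True:
        time.sleep(0.5)
        for i, p in enumerate(processes):
            # is_alive() only detects a dead/exited process; it cannot
            # tell whether a still-running worker is hung.
            if not p.is_alive():
                p.join()
                replacement = multiprocessing.Process(
                    target=worker, args=(i,), daemon=True
                )
                replacement.start()
                processes[i] = replacement
```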
In the description it says: "we're eliminating an entire dependency with only a few lines." But gunicorn does check whether the process is stuck, so it's the same scope, and this is not a replacement for it.
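To illustrate the difference in scope being argued here, below is a rough, hypothetical sketch of heartbeat-style "stuck" detection, conceptually similar to what gunicorn's arbiter does but not its actual implementation; the timeout, helper names, and heartbeat mechanism are all assumptions:

```python
import multiprocessing
import time


def worker(heartbeat) -> None:
    # The worker periodically proves it is making progress.
    while True:
        heartbeat.value = time.time()
        time.sleep(1)  # stand-in for real request handling


def kill_if_stuck(process, heartbeat, timeout=30.0) -> None:
    # A worker that is alive but hasn't heartbeated within `timeout`
    # seconds is treated as stuck and killed, so a restart loop like the
    # one in this PR could then replace it.
    if process.is_alive() and time.time() - heartbeat.value > timeout:
        process.kill()


if __name__ == "__main__":
    heartbeat = multiprocessing.Value("d", time.time())
    proc = multiprocessing.Process(target=worker, args=(heartbeat,), daemon=True)
    proc.start()
    while True:
        time.sleep(5)
        kill_if_stuck(proc, heartbeat)
```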
Sorry, my bad, you are right: the multiprocess isn't used in the single-worker case.
I don't think we should include this change, given the "simplicity" and how it "lies" to users about being reliable.
But... if we want this to get in, at least the following should happen:
- Add tests.
- Answer the question: why shouldn't this be an opt-in feature?
@@ -6,6 +6,7 @@ on:
    branches: ["master"]
  pull_request:
    branches: ["master"]
  workflow_dispatch:
Why did you add this?
Without creating a PR you mean?
Yes, that's correct. It allows for running the test suite manually on your own fork or branch prior to creating a pull request.
In the description it says: "we're eliminating an entire dependency with only a few lines." But gunicorn does check whether the process is stuck, so it's the same scope, and this is not a replacement for it.
It seems like advanced monitoring should be opt-in and simple "is alive" monitoring should be the default, since Python doesn't let you "opt out" of dependencies.
The current behavior is broken:
We're not doing this. I prefer to have a reliable restart, as I said in #1942 (comment). I'll make @abersheeran's PR happen instead: #2183 👍
Drastically simplifies app deployment for Starlette and FastAPI for many users.
Survive worker crashes directly in Uvicorn: go ahead and run your ffmpeg and machine learning tasks. Uvicorn will auto-restart your workers up to the desired --workers count, reliably, even under heavy load.

Related issues
Deployments right now...
caddy/nginx ➡️ gunicorn ➡️ uvicorn ➡️ starlette/fastapi
For many users, this can be simplified to...
caddy/nginx ➡️ uvicorn ➡️ starlette/fastapi
Stress tested to 100,000s of requests per second using hey.
Thoughts, feedback appreciated.
I realise this isn't as "feature rich" as some may want (handling various signals, etc.), but we're eliminating an entire dependency with only a few lines. It works well and is a relatively tiny change for big benefits, and can easily be refactored out if something better is implemented in the future.
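As a rough usage sketch of the simplified deployment described above (the "main:app" module name and worker count are hypothetical), running multiple supervised workers straight from Uvicorn could look like this:

```python
# run.py — a hypothetical entry point; "main:app" is a placeholder ASGI app.
import uvicorn

if __name__ == "__main__":
    # With workers > 1, uvicorn spawns and supervises the worker processes
    # itself, which is the code path this PR changes.
    uvicorn.run("main:app", host="0.0.0.0", port=8000, workers=4)
```

The equivalent CLI invocation would be `uvicorn main:app --workers 4`.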