Pebble service is not restarted on on-check-failure if the OS process was killed #149
Comments
Hi @marceloneppel, thanks for the report. This is a strange one. The health check shouldn't actually be needed here, as the default behaviour for

Note that the DEBUG lines will only be visible if you start Pebble with the

Can you try a couple of things for me? And let me know whether you still see the same behaviour then. If not, it's something Patroni-specific, and we can dig into your repro case with Patroni a bit deeper.
Hi @benhoyt! Thanks for all this information. I checked the things that you mentioned:

So, in summary, it is something specific to Patroni. I am going to gather some more information and maybe create another, simpler script that behaves like Patroni (as you suggested, perhaps running as a daemon or forking), so we can better understand what is happening in that case.
Were there any other logs after that last line? From following the code through: after
No, unfortunately there were no more logs. I waited for ten minutes and checked the logs again. The last line in the logs is
@benhoyt, I could create a simple script that shows the same behaviour that Patroni has when it is killed and Pebble doesn't restart it. It is a script that starts another process using `subprocess.Popen`:

```python
import subprocess
import time

subprocess.Popen(["/usr/bin/python3", "/home/app/script2.py"])

# Just a loop to keep the program running.
while True:
    time.sleep(30)
```

The other script that is started (script2.py) can be anything (in my case I used the TCP server code that you shared with me). After killing the first script's process, I get the following logs:
You can check all the files here.
Thanks for the minimal repro -- I can repro this easily locally now. It will probably happen for any service that itself starts up a child process. What's happening is Pebble's "process reaper" detects that the parent PID exited, and then it calls Go's

So if you just send SIGKILL to the parent process, Pebble won't detect that it died ... until all the child processes die or are killed too.

You note that it's different from supervisord (I haven't tested systemd). I could be wrong, but this seems to me a feature, not a bug: it wouldn't be ideal to restart the parent process when there are still child processes running. In the case of Patroni, it would mean Pebble would try to restart Patroni, which would restart a bunch of Postgres processes ... but Postgres is already running.

Is the right "fix" here for you to kill the process group instead of just the Patroni process? You can do this with

Note that Pebble itself uses SIGTERM/SIGKILL on the process group when it executes a
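(The exact command referenced above wasn't captured in this copy of the thread. As one illustration only -- not necessarily what was being suggested -- signalling a whole process group rather than a single PID could look like the following Go sketch; the target PID is just a command-line argument made up for the example.)

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"syscall"
)

// killProcessGroup sends SIGKILL to the entire process group that pid
// belongs to. Passing a negative PID to kill(2) signals every member
// of that group, not just the one process.
func killProcessGroup(pid int) error {
	pgid, err := syscall.Getpgid(pid)
	if err != nil {
		return err
	}
	return syscall.Kill(-pgid, syscall.SIGKILL)
}

func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: killgroup <pid>")
		os.Exit(1)
	}
	pid, err := strconv.Atoi(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "invalid pid:", os.Args[1])
		os.Exit(1)
	}
	if err := killProcessGroup(pid); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
}
```

Note that this only reaches children that stayed in the parent's process group; as discussed further down, a child that calls setsid escapes it.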
I've discussed this with @hpidcock, and he noted the systemd docs for KillMode and ExecStop -- it's a bit tricky to piece together, but they imply that in the case where the parent service is SIGKILL'd, systemd will send the SIGTERM...SIGKILL sequence to the remaining processes in the process group. This is when KillMode is set to

I've verified this with a little test systemd service:
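(The unit file used for that test isn't preserved here. A minimal stand-in that exercises the same behaviour -- with a hypothetical script path, and relying on KillMode's default of control-group -- might look like this:)

```ini
# /etc/systemd/system/kill-test.service -- hypothetical test unit.
[Unit]
Description=Parent process that spawns a long-lived child

[Service]
Type=simple
ExecStart=/usr/bin/python3 /home/app/script1.py
Restart=always
# KillMode defaults to control-group: when the unit is stopped (or the
# main process dies and Restart= kicks in), systemd sends SIGTERM and
# then SIGKILL to all remaining processes in the unit's cgroup.
```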
When you send SIGTERM or SIGKILL to the parent process and kill it, systemd sends SIGTERM to the child process too. So between supervisord and systemd, that definitely seems like the right default behaviour.

We'll need to fix this -- updating the service manager code to kick off the terminating state when it detects that child processes are still running. In the meantime, I'd still love to hear more context about your use case here.
Hi Ben! Thanks again for all this information about Pebble. It is great for me to understand some more aspects of it. I haven't tried to kill the Patroni process group, but I can test it here. Besides that, it's nice that the behaviour could also be validated on systemd.

Sorry, I forgot to give more details about my use case. It isn't related to a feature that I need to implement; it is more related to a shakedown test that we need to implement in the different database charmed operators that we are developing on the Data Platform team. In my case, I need to test that the PostgreSQL charmed operator recovers correctly from some situations, like the database process being killed, or frozen and later unfrozen, and other situations that should cause some disruption in the service (for that I am currently using SIGKILL, SIGSTOP, SIGCONT and other signals, on both the Patroni process -- the parent process -- and the PostgreSQL process -- the child process). The final idea is to test that a cluster of multiple units keeps working and that the failing unit recovers itself (becoming part of the cluster again).

So in summary, the situation that I am trying to test is kind of a chaos scenario, to ensure that the charm (and Pebble) correctly manages the workload.
Hi @benhoyt and @hpidcock, nice to meet you!
Let me summarize our use case:
As you confirmed, the current Pebble behavior does not fully replicate the default systemd/supervisord logic. Meanwhile, @marceloneppel is testing the workaround of sending a KILL signal to the whole group instead of a single process. Thank you in advance!
Thanks for the additional context @taurus-forever. I do hope to fix this at some point; however, for the next (few?) weeks we're finishing/fixing/testing Juju 3.0, so it'll almost certainly be after that.
Hi @benhoyt and @taurus-forever! Thanks for the updates. I tested killing the process group, but it still doesn't make Pebble restart the service. I researched why and figured out that the command (pg_ctl) that Patroni is using to start the child process (PostgreSQL) is

With the three scripts below we can check that the child process ends up in another group:

script1.py

```python
import subprocess
import time

subprocess.call(["/usr/bin/python3", "/home/app/script2.py"])

# Just a loop to keep the program running.
while True:
    time.sleep(30)
```

script2.py

```python
import os
import subprocess

pid = os.fork()

# Pid equal to 0 represents the created child process.
if pid == 0:
    os.setsid()
    subprocess.Popen(["/usr/bin/python3", "/home/app/script3.py"])
```

script3.py

```python
import http.server
import socketserver

PORT = 8081
Handler = http.server.SimpleHTTPRequestHandler

with socketserver.TCPServer(("", PORT), Handler) as httpd:
    print("serving at port", PORT)
    httpd.serve_forever()
```

Logs:

```
root@app-0:/# ps -e -o pid,ppid,pgid,cmd
  PID  PPID  PGID CMD
    1     0     1 /usr/bin/pebble run
   14     1    14 /usr/bin/python3 /home/app/script1.py
   17     1    16 /usr/bin/python3 /home/app/script3.py
   19     0    19 bash
   30    19    30 ps -e -o pid,ppid,pgid,cmd
```

So it seems to be the case that you mentioned at the beginning: Patroni is forking (and also moving the new process to a new group). From what I checked, supervisord restarts the service, but it keeps the original script3.py process running (which may itself be a problem in some cases).
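(For reference, the session-detaching trick that script2.py emulates with os.fork + os.setsid -- and that pg_ctl effectively performs -- can also be expressed in Go. This is only an illustrative sketch, not Pebble code; a child started this way leads its own session and process group, so a signal sent to the parent's process group never reaches it.)

```go
package main

import (
	"os/exec"
	"syscall"
)

func main() {
	cmd := exec.Command("/usr/bin/python3", "/home/app/script3.py")
	// Setsid makes the child the leader of a new session (and process
	// group), detaching it from the parent's group -- signals sent to
	// the parent's process group no longer reach it.
	cmd.SysProcAttr = &syscall.SysProcAttr{Setsid: true}
	if err := cmd.Start(); err != nil {
		panic(err)
	}
	// No Wait: the child is left running on its own, like a daemon.
}
```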
Just for the record, some further reading on this subject that I came across today: "The Unix process API is unreliable and unsafe" and the author's supervise project. Also the Hacker News comments and another related article. Also this comment from someone who works on the Bazel build system and hit the exact same issue.
For the record, we plan to address this by using cgroups as part of work later in this cycle.
Thanks for the update @benhoyt!
Add WaitDelay to ensure cmd.Wait() returns in a reasonable timeframe if the goroutines that cmd.Start() uses to copy Stdin/Stdout/Stderr are blocked when copying due to a sub-subprocess holding onto them. Read more details in these issues:

- golang/go#23019
- golang/go#50436

This isn't the original intent of kill-delay, but it seems reasonable to reuse it in this context.

Fixes canonical#149
@marceloneppel Okay, I've come in a weird circle on this one. I had actually misdiagnosed the original problem with the Patroni issue you present above, where Patroni wasn't restarting properly even after a `KILL` signal was sent to it.

I thought this problem was the issue with Pebble not tracking the process tree of processes that daemonise and change their process group (which is still an issue, but is not causing this problem). But after actually diving into your patroni-restart-test repro in more detail, I found that Pebble wasn't marking the Patroni process as having exited at all, due to being blocked on `cmd.Wait()`.

In my tests with Patroni that fixes the problem nicely: when you send

The reason I came in a weird circle with this is that I started down the track of "we need to use cgroups to solve this" too early. I even had a rough plan to do so, but it turns out that using cgroups in containers is not possible or not trivial in many cases, so that came to a head, and I started looking at other ways to solve the original problem, hence circling back around. I should have noticed this much earlier, sorry!

We may still improve Pebble's process tree handling, but I believe this takes the pressure off, and we may be able to get away without doing that. In any case, my one-liner
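(To make the blocked-Wait mechanism concrete, here is a small self-contained Go sketch -- not Pebble's code -- of the failure mode and the WaitDelay workaround described above. The shell command and durations are invented for illustration.)

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"time"
)

func main() {
	// bash starts a long-lived grandchild (sleep) that inherits the
	// stdout pipe, then exits immediately itself.
	var out bytes.Buffer
	cmd := exec.Command("bash", "-c", "sleep 300 & echo started")
	// A non-*os.File writer means os/exec copies output through a pipe
	// in a goroutine; Wait normally blocks until that pipe closes,
	// which here only happens once the grandchild exits too.
	cmd.Stdout = &out

	// WaitDelay (Go 1.20+) bounds how long Wait keeps waiting for the
	// I/O pipes after the direct child has exited.
	cmd.WaitDelay = 2 * time.Second

	if err := cmd.Start(); err != nil {
		panic(err)
	}
	start := time.Now()
	err := cmd.Wait()
	fmt.Printf("Wait returned after %v: err=%v, output=%q\n",
		time.Since(start).Round(time.Second), err, out.String())
}
```

Without WaitDelay, Wait here would block for roughly the full five minutes of the grandchild's sleep; with it, Wait gives up on the pipe after two seconds and returns.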
Hi @benhoyt! No worries and thank you for the effort on this! I'm going to check it with the charm this week to confirm whether it worked and keep you posted. Thanks again!
…/out/err (#275)

Use os/exec's Cmd.WaitDelay to ensure cmd.Wait returns in a reasonable timeframe if the goroutines that cmd.Start() uses to copy stdin/out/err are blocked when copying due to a sub-subprocess holding onto them. Read more details about the issue in golang/go#23019 and the proposed solution (added in Go 1.20) in golang/go#50436.

This solves issue #149, where Patroni wasn't restarting properly even after a `KILL` signal was sent to it. I had originally mis-diagnosed this problem as an issue with Pebble not tracking the process tree of processes that daemonise and change their process group (which is still an issue, but is not causing this problem). The Patroni process wasn't being marked as finished at all due to being blocked on the `cmd.Wait()`. Patroni starts sub-processes and "forwards" stdin/out/err, so the copy goroutines block. Thankfully Go 1.20 introduced `WaitDelay` to allow you to easily work around this exact problem. The fix itself is this one-liner (#275):

```go
s.cmd.WaitDelay = s.killDelay() * 9 / 10 // 90% of kill-delay
```

This will really only be a problem for services, but we make the same change for exec and exec health checks as it won't hurt there either.

As a drive-by, this PR also canonicalises some log messages: our style is to start with an uppercase letter (for logs, not errors) and to use "Cannot X" rather than "Error Xing".

Fixes #149.
Thanks so much @benhoyt! I double checked with the charm and the issue was fixed. To test, I used the following replace directive on Juju's
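(The actual directive wasn't captured here. For illustration only, a replace directive in Juju's go.mod pointing the Pebble dependency at a build containing the fix might look like the following; the pseudo-version string is entirely a placeholder.)

```
// go.mod (hypothetical): pin Pebble to a revision that includes the fix.
replace github.com/canonical/pebble => github.com/canonical/pebble v0.0.0-20230601000000-abcdef012345
```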
Issue
If the OS process related to a Pebble service is killed (for example using `pkill --signal SIGKILL -f process_name`) and that service has a health check that instructs Pebble to restart it on a check failure (the `on-check-failure` field from the Pebble layer spec), Pebble tries to restart the process but finishes with a `no such process` error. The number of check failures continues to grow and Pebble doesn't try to, for example, just start the service, as there is no old OS process for that service to be stopped before the service starts again.

There are more details in an example repo that I used to test that behaviour. I added some more description and some logs comparing what happens with Pebble and also when using another tool, in that case supervisor, to start the service and later kill the OS process.
Is there a way to solve that issue (making the service restart correctly using the Pebble checks) using other fields from the Pebble layer? I found a workaround that I implemented in my charm code, but I would like to confirm whether this is a bug or not.
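(For readers unfamiliar with the layer syntax being referred to, a minimal sketch of this kind of configuration is below. The service name, command, check URL, and timings are hypothetical, not taken from the example repo.)

```yaml
services:
  patroni:
    override: replace
    command: patroni /etc/patroni/patroni.yml
    startup: enabled
    # Restart this service whenever the "patroni-up" check fails.
    on-check-failure:
      patroni-up: restart

checks:
  patroni-up:
    override: replace
    period: 10s
    threshold: 3
    http:
      url: http://localhost:8008/health
```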