seeing error "waitid: no child processes" from run-to-completion processes while reaping #11
Comments
Yep I'm able to reproduce this fairly easily, most subprocesses I run and wait() while go-reaper was running give me |
@ahmetb, hmm ok, that does seem a bit weird. One way that waitid is likely to fail is if the process that you forked off has already exited and your code has consumed the output before the But I would have thought that those 2 commands you mentioned ( So on the debugging end, I'd check the A simple method would be to do some old-style debugging with printfs at:
And see what the flow is like. Alternatively, does using
instead of Thanks. |
I opted in for using a proper pid 1 init process in my application so I'm no longer using this. The forked processes were definitely running for a while. |
@ahmetb could you be more specific? I'm having this issue also. |
@jadolg do you have a standalone program I can reproduce this with? Thanks. |
@ramr It's a bit tricky because of the nature of the software I'm making. |
@jadolg Also there's a test in the code which looks similar to the issue you are facing. So try this:
and
Or am I misreading your comment in that it is the second call with If so, you could probably change the code for that slightly ala instead of: use something like:
Does that help? |
Hi @ramr
which is exactly the problem I'm trying to solve as every container that exits will leave behind a defunct |
@jadolg what about if you remove setting the
|
Whenever I use
lfsmount does not get killed. It does not matter if I set pgid or not.
@jadolg is there a way to run this inside a docker image? Basically some image that I can pull down and run to debug it. I tried to build the plugin from the README but I seem to be hitting some docker errors (enabling the plugin - |
@jadolg can you also send/attach the output from ps which shows the process group id of the hanging process? Maybe the forked lfsmount process is using a different process group id. |
It's a bit hard to debug because it's a Docker plugin, and it even uses a file for its socket.
and executing
which is only the process that mounts the root of my filesystem.
now, when I start a container
and this happens when killing the container
And here I'm setting SysProcAttr with the pgid option
This is the output when
And this when it's finished
I hope this can provide enough context for you. My guess is that |
@jadolg, ok, looks like that is setting the process group id/creating a new session. So one solution here is to have the pid 1 do the reaping and have your code run inside a
And there are 3 different notes for you to adjust your code accordingly. Aside: hmm, maybe something can be done from a config option perspective. Let me mull on that a bit, as we do require an entry point to invoke as well. |
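The approach described above (pid 1 only reaps, the application runs in a child process) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code from this comment or from the go-reaper README: the marker variable `REAPER_SKETCH_WORKER` and the functions `isWorker`, `superviseAsInit` and `runApp` are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"syscall"
)

// workerEnv is a hypothetical marker telling the re-executed binary
// that it is the application child and not the pid 1 supervisor.
const workerEnv = "REAPER_SKETCH_WORKER"

// isWorker reports whether the marker variable is present in env.
func isWorker(env []string) bool {
	for _, kv := range env {
		if strings.HasPrefix(kv, workerEnv+"=") {
			return true
		}
	}
	return false
}

// superviseAsInit re-executes the current binary as a child, then reaps
// every child of this process until that worker exits. It returns the
// worker's exit code (-1 if the worker was killed by a signal).
func superviseAsInit() (int, error) {
	exe, err := os.Executable()
	if err != nil {
		return 1, err
	}
	attr := &syscall.ProcAttr{
		Env:   append(os.Environ(), workerEnv+"=1"),
		Files: []uintptr{0, 1, 2}, // share stdin, stdout, stderr
	}
	worker, err := syscall.ForkExec(exe, os.Args, attr)
	if err != nil {
		return 1, err
	}
	for {
		var ws syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &ws, 0, nil)
		if err == syscall.EINTR {
			continue
		}
		if err != nil {
			return 1, err
		}
		if pid == worker {
			return ws.ExitStatus(), nil
		}
		// Any other pid is an orphan that was re-parented to us.
	}
}

// runApp stands in for the real application code.
func runApp() {
	fmt.Println("application code running as pid", os.Getpid())
}

func main() {
	if os.Getpid() == 1 && !isWorker(os.Environ()) {
		code, err := superviseAsInit()
		if err != nil {
			fmt.Fprintln(os.Stderr, "supervisor:", err)
		}
		os.Exit(code)
	}
	runApp()
}
```

With this split, the worker is not pid 1, so orphans are re-parented to the supervisor and nothing in the worker races with its own Wait calls. Outside a container the program is not pid 1 and simply runs `runApp` directly.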
@ramr I needed to slightly modify the code calling |
@jadolg Cool. Glad that worked for you. But that said, I think from a reaper perspective, making this fix somewhat generic might be a better option. Let me mull on that. For now, I will probably just document the above mentioned usage in the README. Thx |
…init), so that any process management work done inside your code is unaffected. fixes #11
Similar to #2 but a little different.
After doing go reaper.Reap() in my program, I'm doing simple run-to-completion executions, such as:
This is returning error
When I do not do go reaper.Reap(), things seem to work fine.
For example, more weirdly, this happens only while running:
but not while running:
I have a suspicion that these two commands are different somehow.
How to debug this?