Refactor youki delete to match runc/crun. #1654
Conversation
Codecov Report
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1654      +/-  ##
=========================================
- Coverage   68.99%   68.91%   -0.09%
=========================================
  Files         120      120
  Lines       13155    13166      +11
=========================================
- Hits         9076     9073       -3
- Misses       4079     4093      +14
Hey, I'll take a look at this later today 👍
Thank you.
Looks good to me. 💯
Refactored the delete to match runc and crun. Now youki allows deletion of created containers. Signed-off-by: Eric Fang <[email protected]>
Hey, I have left a couple of comments, but overall, looks great! Really nice commenting 👍
    self.do_kill(signal::Signal::SIGKILL, false)?;
    self.set_status(ContainerStatus::Stopped).save()?;
}
ContainerStatus::Creating | ContainerStatus::Running | ContainerStatus::Paused => {
Should the Creating state also be handled along with Created? If I'm correct, then even in the Creating state there should be only one process in the container's cgroup. If that is correct, then we don't get any advantage / have any reason to use kill all with force. I checked the spec, and it specifies that states other than stopped should generate an error, but we are going according to runc anyway.
Please correct me if my understanding is wrong.
Technically 2 processes: the container init and the intermediate process before it exits. With that being said, it is possible I am over-optimizing in this case. The goal here is to kill all the processes in the container cgroup, so I should possibly just call kill all here in either case.
Hey, if there are two processes, then this logic makes perfect sense.
Also, I don't think we should merge Created with the other 3 states (if that was what you meant), because as per runc and crun, we need to allow deleting Created without the force flag, but we need the force flag for the others.
If there are going to be (even technically) two processes in the Creating state, it is better to handle it by killing everything in the cgroup, just to be safe.
Also, I don't think we should merge Created with the rest 3
No, we should not. I wanted a single match statement for the different cases to make the code more readable. In this way, we don't have to mess with if statements on status + the force flag.
If there are going to be (even technically) two processes in the Creating state, it is better to handle it by killing everything in the cgroup, just to be safe.
This is what I believe as well.
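For concreteness, here is a minimal, self-contained sketch of that single-match idea. The status names mirror the ones discussed above, but the Container type and helpers such as kill_all_processes are simplified placeholders for illustration, not youki's actual implementation.

```rust
// Sketch only: deciding the delete policy with one match on (status, force).
// Container, kill_all_processes, and the error strings are hypothetical.

#[derive(Debug, Clone, Copy, PartialEq)]
enum ContainerStatus {
    Creating,
    Created,
    Running,
    Paused,
    Stopped,
}

struct Container {
    status: ContainerStatus,
}

impl Container {
    fn kill_all_processes(&mut self) -> Result<(), String> {
        // Placeholder: send SIGKILL to every process in the container's cgroup.
        Ok(())
    }

    fn delete(&mut self, force: bool) -> Result<(), String> {
        match (self.status, force) {
            // Created containers may be deleted without --force (runc/crun
            // behavior); kill everything in the cgroup anyway, to be safe.
            (ContainerStatus::Created, _) => {
                self.kill_all_processes()?;
                self.status = ContainerStatus::Stopped;
            }
            // Creating, running, and paused containers require --force.
            (
                ContainerStatus::Creating | ContainerStatus::Running | ContainerStatus::Paused,
                true,
            ) => {
                self.kill_all_processes()?;
                self.status = ContainerStatus::Stopped;
            }
            (
                ContainerStatus::Creating | ContainerStatus::Running | ContainerStatus::Paused,
                false,
            ) => {
                return Err("container is not stopped; use --force to delete".to_string());
            }
            // Stopped containers can always be deleted.
            (ContainerStatus::Stopped, _) => {}
        }
        // ... then remove the container root, cgroup, and saved state.
        Ok(())
    }
}

fn main() {
    let mut created = Container { status: ContainerStatus::Created };
    assert!(created.delete(false).is_ok());

    let mut running = Container { status: ContainerStatus::Running };
    assert!(running.delete(false).is_err());
    assert!(running.delete(true).is_ok());
}
```

The point of this shape is that the whole policy (Created deletes without --force, Creating/Running/Paused require it, Stopped always deletes) reads directly off the match arms instead of being spread over nested if statements on status and the force flag.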
Err(err) => {
    // There is a brief window where the container state is
    // created, but the container config is not yet generated
    // from the OCI spec. In this case, we proceed as if we
    // successfully deleted the config and move on.
    log::warn!("skipping loading youki config due to: {err:?}, continue to delete");
}
Do we need to worry about a race condition? For example, if we are deleting a container that is being created, there might be a window where the config is not yet generated, and we treat it as deleted. If the config does get created after this moment, should we be concerned about it or not?
If we are treating the deletion of the container root after this as a catch-all, then I guess this is fine.
You asked a very good question and here is what I think. First of all, the actual bug is addressed in (#1649) and this PR only adds a warning and an explanation, so the context of the bug is in #1649 and #1640. In short, if rootless::New errors out for whatever reason, we are stuck in a state where the container state is created but the youki config is not. This will cause the delete to not be able to clean up, because it is missing the youki config.

Here is the complex part about the race condition, and you are right that there is a potential race here. If process 1 is calling youki create and process 2 is calling youki delete --force, we have a window where process 1 has not launched the container process yet, but process 2 goes ahead to delete. There is no way for process 2 to signal process 1 to stop creating the container processes with the force flag. What will happen is that process 2 will delete the container root, and process 1 will continue to start the process and wait on the exec.fifo, which is a file (named pipe). So when the container root is deleted, the exec.fifo will be gone and the process waiting on it will error out. Therefore, both should be cleaned up correctly. Or process 1 at some point tries to save the state and fails. So in theory, the race condition will not cause leakage in terms of processes or files on the system. I believe this is the best we can guarantee. I think this covers all the situations, but please double-check my work here.
Hey, I think this makes sense 👍
I think this should solve the issue and not cause any leakage of resources.
If possible, give me a day's time, I'll try to do a manual verification check by launching parallel processes and checking that resources are cleaned up.
In either case, I think this implementation is good enough :)
Hey, I think this makes sense 👍 I think this should solve the issue and not cause any leakage of resources. If possible, give me a day's time, I'll try to do a manual verification check by launching parallel processes and checking that resources are cleaned up.
In either case, I think this implementation is good enough :)
Take your time please :) I am curious as well.
Hey @yihuaf Unfortunately, I didn't get time last week to do this 😓 I'll try to get this done soon 🙏
Another option is to merge this PR as is, since the PR is fixing a specific issue. We can continue to test the edge case on main and file issues as needed.
LGTM 👍
Thanks for the changes, and apologies on my behalf for the delay 😅
The discrepancy in deleting a
created
status container withoutforce
flag is first brought up in #1556. See the PR for detailed discussion. In this PR, we fix the discrepancy and refactored the delete code to be more readable. In this way, it is clear that we first check the condition if container can be deleted. Then perform the delete. We also add a debug assert to make the the invariant in not broken when deleting.