Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

forcibly close DB and cache socket connection post-fork #8580

Merged
merged 1 commit into from
Jan 15, 2021

Conversation

ryanpetrello
Copy link
Contributor

@ryanpetrello ryanpetrello commented Nov 12, 2020

we've seen evidence of a race condition on fork for awx.conf.model.Setting
access; in the past, we've attempted to solve this by explicitly closing
connections pre-fork, but we've seen evidence that this isn't always
good enough

this patch is an attempt to close connections post-fork so that sockets
aren't inherited post fork, leading to bizarre race conditions in
setting access

def __clean_on_fork__(self):
pid = os.getpid()
if pid not in self.__dict__['__forks__']:
self.__dict__['__forks__'][pid] = True
Copy link
Contributor Author

@ryanpetrello ryanpetrello Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to access this via __dict__ instead of getattr to avoid a recursive loop (because __getattr__ calls this function via _get_local).

Copy link
Contributor Author

@ryanpetrello ryanpetrello Nov 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also tested this out by printing this dict in a tight loop of awx-python forks, and the data structure for each process never grows beyond two keys (the parent pid, and the current process pid).

So in effect, this branch is entered one time when a child process does getattr(settings, ...), and then after that, it's a no-op.

@ryanpetrello ryanpetrello requested a review from matburt November 12, 2020 18:40
@softwarefactory-project-zuul
Copy link
Contributor

Build failed.

@softwarefactory-project-zuul
Copy link
Contributor

Build succeeded.

we've seen evidence of a race condition on fork for awx.conf.Setting
access; in the past, we've attempted to solve this by explicitly closing
connections pre-fork, but we've seen evidence that this isn't always
good enough

this patch is an attempt to close connections post-fork so that sockets
aren't inherited post fork, leading to bizarre race conditions in
setting access
@softwarefactory-project-zuul
Copy link
Contributor

Build succeeded.

Copy link
Member

@matburt matburt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This harkens back to when Ansible would change pids occasionally on us during event processing years ago.

@softwarefactory-project-zuul
Copy link
Contributor

Build succeeded (gate pipeline).

@softwarefactory-project-zuul softwarefactory-project-zuul bot merged commit 684998c into ansible:devel Jan 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants