Move "FQDNs" grains calculation to "network" module and allow disabled them #52527

meaksh
Contributor

@meaksh meaksh commented Apr 12, 2019

What does this PR do?

This PR fixes an issue that currently blocks "core" grains execution for several seconds when some "FQDNs" cannot be calculated.

Currently, to calculate the fqdns grain, all of the minion's IP addresses are processed sequentially. The problem comes from the underlying calls to socket.gethostbyaddr, which can take 5 seconds to return (after reaching socket.timeout) when no "fqdn" is defined for that IP.

In some scenarios, depending on the minion's network configuration, the fqdns grain calculation can take more than 15 seconds. This completely breaks minion execution, not only at startup but any time the minion refreshes its grains.

This PR executes the fqdn_lookup in parallel (using threads), so in those cases we only wait for "socket.timeout" to be reached once, which avoids blocking the minion execution for such a long time.
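The threaded approach described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: the function name `lookup_fqdns`, the `resolver` parameter (injected so the sketch can be exercised without real DNS) and the `max_workers` default are all assumptions made for the example.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def lookup_fqdns(addresses, resolver=socket.gethostbyaddr, max_workers=8):
    """Resolve every address concurrently and return the sorted unique names.

    Hypothetical helper: ``resolver`` must behave like socket.gethostbyaddr,
    returning a (hostname, aliaslist, ipaddrlist) tuple or raising on failure.
    """

    def _lookup(ip):
        try:
            name, aliases, _ = resolver(ip)
            return [name] + list(aliases)
        except (socket.herror, socket.gaierror, socket.timeout):
            # No reverse record, or the resolver gave up: contribute nothing.
            return []

    addresses = list(addresses)
    if not addresses:
        return []

    # One worker per address, capped at max_workers, so slow lookups overlap
    # instead of adding up one after another.
    workers = min(max_workers, len(addresses))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(_lookup, addresses)
        return sorted({name for names in results for name in names})
```

With as many workers as addresses, the total wait is roughly that of the slowest single lookup. If the pool is smaller than the address list, the lookups run in batches and several timeouts can still add up.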

Besides that, the whole "fqdns" calculation logic has been moved to the network execution module. Since calculating the "fqdns" can take too long when there are networking issues, it makes more sense to move it away from the "core" grains and also to introduce a new configuration parameter called enable_fqdns_grains (default True) that allows skipping the "fqdns" calculation as part of the "core" grains rendering.
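A minimal sketch of how such a switch might gate the calculation, assuming a plain dict stands in for the minion options. The names `fqdns_grain` and `compute_fqdns` are hypothetical and chosen for the example; only the option name enable_fqdns_grains and its default of True come from the description above.

```python
def fqdns_grain(opts, compute_fqdns):
    """Return the ``fqdns`` grain, skipping the lookup when it is disabled.

    ``opts`` is a plain dict standing in for the minion options and
    ``compute_fqdns`` is any zero-argument callable returning a list of
    names; both are placeholders for this sketch.
    """
    if opts.get("enable_fqdns_grains", True) is False:
        # Disabled: return an empty list without touching the network.
        return {"fqdns": []}
    return {"fqdns": compute_fqdns()}
```

Because the option defaults to True, a configuration that never mentions it behaves as before; only an explicit False skips the lookup.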

Tests written?

Yes

Commits signed with GPG?

Yes

@meaksh meaksh force-pushed the 2019.2-calculate-fqdns-in-parallel-to-avoid-blockings branch from e2f22c4 to 2ed509c April 15, 2019 10:36
Contributor

@brejoc brejoc left a comment

People from gevent claim that this is indeed thread safe, even though the underlying C code isn't: https://github.com/gevent/gevent/blob/b01bcf66c9229d78100e93fe2bf57c3b6afcfedd/src/gevent/resolver/thread.py#L55-L59

lgtm 👍

@Akm0d Akm0d self-requested a review April 22, 2019 21:24
@meaksh
Contributor Author

meaksh commented Apr 29, 2019

Do we have any feedback here?

@meaksh meaksh force-pushed the 2019.2-calculate-fqdns-in-parallel-to-avoid-blockings branch from 6324aa6 to 6e549dd April 30, 2019 10:40
@meaksh meaksh force-pushed the 2019.2-calculate-fqdns-in-parallel-to-avoid-blockings branch from 6e549dd to 805efa4 May 9, 2019 14:39
@meaksh meaksh requested a review from a team as a code owner August 7, 2019 09:40
ESiebigteroth and others added 2 commits September 6, 2019 13:03
* Duplicate fqdns logic in module.network
* Move _get_interfaces to utils.network
* Reuse network.fqdns in grains.core.fqdns
* Return empty list when fqdns grains is disabled

Co-authored-by: Eric Siebigteroth <[email protected]>
@meaksh meaksh changed the title Calculate the "FQDNs" grains in parallel to avoid long blockings Move "FQDNs" grains calculation to "network" module and allow disabled them Sep 6, 2019
@meaksh
Contributor Author

meaksh commented Sep 6, 2019

We've pushed some extra changes to this PR that move the "fqdns" calculation away from the "core" grains to the network execution module. The "fqdns" grain is calculated by default but can be omitted by setting the enable_fqdns_grains configuration parameter to False.

@max-arnold
Contributor

max-arnold commented Sep 10, 2019

@meaksh Maybe it makes sense to use the disable_grains (#48773 (comment)) or blacklist_grains (#49481) option to disable the grain?

Then we can avoid introducing yet another config option.

@dincamihai
Contributor

@max-arnold I can see the same approach was taken here:

if __opts__.get('enable_gpu_grains', True) is False:

It looks like blacklist_grains just filters the grains; it does not prevent them from being rendered. In this PR we want to prevent computing the grain because it is expensive and unreliable when DNS is misconfigured.

I see that disable_grains is not documented. Would that prevent computing the fqdns grains by not calling the fqdns function?
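The difference between filtering a grain and never computing it can be illustrated with a toy model. This is not Salt's loader: `render_then_filter` and `skip_before_render` are hypothetical names, grains are modelled as a dict of zero-argument functions, and matching is simplified to exact names.

```python
def render_then_filter(grain_funcs, blacklist):
    """Compute every grain, then drop blacklisted keys.

    The expensive function still runs; only its result is hidden.
    """
    grains = {name: func() for name, func in grain_funcs.items()}
    return {name: value for name, value in grains.items() if name not in blacklist}


def skip_before_render(grain_funcs, disabled):
    """Never call the function behind a disabled grain."""
    return {
        name: func()
        for name, func in grain_funcs.items()
        if name not in disabled
    }
```

Both functions return the same dictionary, but only the second one avoids paying for the slow lookup, which is the behaviour this PR is after.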

@max-arnold
Contributor

max-arnold commented Sep 10, 2019

It looks like blacklist_grains just filters the grains; it does not prevent them from being rendered.

Then maybe it makes sense to improve on that PR (or contact the author) and actually skip rendering blacklisted grains.

I see that disable_grains is not documented. Would that prevent computing the fqdns grains by not calling the fqdns function?

It should, but it works at the module level (i.e. it will disable the whole network grain module).

@dincamihai
Contributor

It looks like blacklist_grains just filters the grains; it does not prevent them from being rendered.

Then maybe it makes sense to improve on that PR (or contact the author) and actually skip rendering blacklisted grains.

I see that disable_grains is not documented. Would that prevent computing the fqdns grains by not calling the fqdns function?

It should, but it works at the module level (i.e. it will disable the whole network grain module).

fqdns is a core grain (not in a separate module), which means we can't use it here, right?

@max-arnold
Contributor

Yes, unless it is moved to a separate module.

@dincamihai
Contributor

Yes, unless it is moved to a separate module.

If we moved it to a separate module, we would change the way Salt works at the moment.
I would avoid changing the grains structure.
With this PR, one doesn't have to add anything to the existing configuration. Everything should work as before. The new config param is needed only when one wants to disable the fqdns grain.
Does it make sense? Have I managed to convince you? :)

@max-arnold
Contributor

max-arnold commented Sep 10, 2019

Everything you said makes complete sense :)

Still, there is an apparent need for a generic way to disable grains, and introducing another one-off option seems suboptimal.

@dincamihai
Contributor

Everything you said makes complete sense :)

Still, there is an apparent need for a generic way to disable grains, and introducing another one-off option seems suboptimal.

You are right about the generic way of disabling grains. I think it deserves a good and consistent implementation in its own PR.
I've removed the parameter to disable the fqdns grain from this PR. It now contains only the implementation with threads and the refactoring that allows fqdns retrieval both via grains and from the network.fqdns module.

We will keep the disabling of this grain as a patch in our own salt package until a generic way of disabling grains is available.

Is there anything else we can do to get this one merged?

@meaksh
Contributor Author

meaksh commented Dec 10, 2019

Closing in favor of #55581

@meaksh meaksh closed this Dec 10, 2019