-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Better callback registration/deregistration in host provider's lifecycle #2485
Conversation
This pull request is now in conflicts. Could you fix it? 🙏
|
…he Agent constructor
4454398
to
416c683
Compare
🌐 Coverage report
|
@@ -137,7 +153,7 @@ func (c *controller) Run(ctx context.Context) error { | |||
state.Context = localCtx | |||
state.signal = notify | |||
go func(name string, state *dynamicProviderState) { | |||
defer wg.Done() | |||
defer closeProvider(name, state.provider) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this changes the controller some. If you where to create a controller, run it / stop it, and run it again. Then the callback would not be registered because the close is called at the end of run, but the register is called in the New.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point. The closeable providers need to be closed when a controller is done with all its runs. Let me try to find an appropriate place in the controller lifecycle where this could happen. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -114,6 +112,16 @@ func ContextProviderBuilder(log *logger.Logger, c *config.Config, _ bool) (corec | |||
if p.CheckInterval <= 0 { | |||
p.CheckInterval = DefaultCheckInterval | |||
} | |||
|
|||
p.fqdnFFChangeCh = make(chan struct{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also turn this into a buffered channel so its not blocking when the callback hook is called but the provider is not running.
Now with it separate it is possible that no one is reading from the channel when it is pushed onto the channel.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Addressed in b0b55c3.
} | ||
func (c *contextProvider) onFQDNFeatureFlagChange(new, old bool) { | ||
// FQDN feature flag was toggled, so notify on channel | ||
c.fqdnFFChangeCh <- struct{}{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment about the buffer channel.
You only need to ensure that a struct is on the channel, if one is already there you really don't need to push it. So I would change this line to:
select {
case c.fqdnFFChangeCh <- struct{}{}:
default:
}
This removes any chance of blocking, but ensures that if Run is running or will be ran it will get the notification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Addressed in 4a5a7c2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fixes, changes look good!
…cle (#2485) * Implement CloseableProvider * Make hostProvider implement CloseableProvider * Update test to be more resilient * Move feature flags parsing from standalone configuration to late in the Agent constructor * Add explanatory comment for closeProvider function * Running mage fmt * Remove unused function * Add Close() to controller interface and implement it * Call composable controller's Close() method * Use buffered channel to prevent blocking * Don't block until Run() is running * Call composable controller's Close() in tests (cherry picked from commit 86c3395) # Conflicts: # internal/pkg/agent/application/application.go # internal/pkg/agent/cmd/run.go
…host provider's lifecycle (#2486) * Better callback registration/deregistration in host provider's lifecycle (#2485) * Implement CloseableProvider * Make hostProvider implement CloseableProvider * Update test to be more resilient * Move feature flags parsing from standalone configuration to late in the Agent constructor * Add explanatory comment for closeProvider function * Running mage fmt * Remove unused function * Add Close() to controller interface and implement it * Call composable controller's Close() method * Use buffered channel to prevent blocking * Don't block until Run() is running * Call composable controller's Close() in tests (cherry picked from commit 86c3395) # Conflicts: # internal/pkg/agent/application/application.go # internal/pkg/agent/cmd/run.go * Fixing conflicts --------- Co-authored-by: Shaunak Kashyap <[email protected]>
What does this PR do?
This PR makes the following changes to the host provider:
CloseableProvider
interface and updates the host provider to use this new interface. Specifically, theClose()
method on the host provider is used to deregister the FQDN feature flag change callback.It also changes when the feature flags from a standalone configuration are applied - at the very end of the Agent's construction process (
application.New()
).Why is it important?
Prior to this PR, callback registration (and deregistration) were happening inside the host provider's
Run()
method. This introduced a race condition between when the FQDN feature flag change callback registration happened and when the same callback might be called due to a change in the FQDN feature flag:By moving the callback registration to the host provider's constructor (and, less relevantly but symmetrically, the callback deregistration to the host provider's new
Close()
method), we now ensure that the callback will be registered when the provider is constructed. Providers are constructed in the controller's constructor, which is now called before feature flags from standalone configuration or fleet-managed configuration are applied and any callbacks are called.A secondary benefit of these changes is that the
host.TestFQDNFeatureFlagToggle
unit test is now no longer flaky.It was the flakiness that pointed to the race condition in the implementation prior to this PR.
Checklist
I have made corresponding changes to the documentationI have made corresponding change to the default configuration filesI have added an entry in./changelog/fragments
using the changelog toolI have added an integration test or an E2E testAuthor's Checklist
How to test this PR locally
To check that the
host.TestFQDNFeatureFlagToggle
unit test is no longer flaky, run it a 1000 times:Related issues
TestFQDNFeatureFlagToggle
unit test: Increase sleep to avoid flakiness #2473require.Eventually
instead oftime.Sleep
in unit test #2480Use cases
Screenshots
Logs
Questions to ask yourself