Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

agent: Fix a stop discovery channel issue #612

Merged
merged 2 commits into from
Jul 25, 2023

Conversation

diconico07
Copy link
Contributor

What this PR does / why we need it:
The stop discovery channel is shared by all Configuration resources of a given discovery handler endpoint, this channel was used without giving any context of the targeted DiscoveryOperator, leading to a stop of discovery for all Configurations when a Configuration is modified or deleted.

This commit makes the stop discovery channel send the concerned Configuration ID (namespace + name) on the channel, allowing to filter out requests targeting another DiscoveryOperator, the channel is still shared so the cases where we really want to stop discovery for all Configurations still works.

Fixes #601

Special notes for your reviewer:
Currently in draft as I still need to add some tests

If applicable:

  • this PR has an associated PR with documentation in akri-docs
  • this PR contains unit tests
  • added code adheres to standard Rust formatting (cargo fmt)
  • code builds properly (cargo build)
  • code is free of common mistakes (cargo clippy)
  • all Akri tests succeed (cargo test)
  • inline documentation builds (cargo doc)
  • all commits pass the DCO bot check by being signed off -- see the failing DCO check for instructions on how to retroactively sign commits

@kate-goldenring
Copy link
Contributor

It's been awhile since I've looked at this, but aren't we creating a discovery operator for each configuration, regardless of whether they have a shared Discovery Handler / protocol?

let (stop_discovery_sender, _): (broadcast::Sender<()>, broadcast::Receiver<()>) =

@diconico07
Copy link
Contributor Author

We indeed create a discovery operator per configuration, but the stop_discovery_sender you highlight here triggers a stop_all_discovery from the operator, that will iterate over the discovery handler's endpoints to tell each of them to stop discovery.

for (endpoint, dh_details) in discovery_handler_details_map.clone() {
match dh_details.close_discovery_handler_connection.send(()) {

This is needed as we do an internal_do_discover per handler's endpoint (eventually do_discover_on_discovery_handler calls it).

for (endpoint, dh_details) in discovery_handler_details_map.clone() {
trace!(
"do_discover - for {} discovery handler at endpoint {:?}",
config.spec.discovery_handler.name,
endpoint
);
let discovery_operator = discovery_operator.clone();
let kube_interface = kube_interface.clone();
discovery_tasks.push(tokio::spawn(async move {
do_discover_on_discovery_handler(
discovery_operator.clone(),
kube_interface.clone(),
&endpoint,
&dh_details,
)
.await

And as the discovery_handler_map (and thus the close_discovery_handler_connection) is shared at agent level, we need to do it this way.
We could also have another broadcast channel at operator's level that is not shared between operators, but we would still need the shared one to correctly handle endpoint changes, so I found it easier this way.

The stop discovery channel is shared by all Configuration resources of a
given discovery handler endpoint, this channel was used without giving
any context of the targeted DiscoveryOperator, leading to a stop of
discovery for all Configurations when a Configuration is modified or
deleted.

This commit makes the stop discovery channel send the concerned
Configuration ID (namespace + name) on the channel, allowing to filter
out requests targeting another DiscoveryOperator, the channel is still
shared so the cases where we really want to stop discovery for all
Configurations still works.

Signed-off-by: Nicolas Belouin <[email protected]>
Signed-off-by: Nicolas Belouin <[email protected]>
@diconico07
Copy link
Contributor Author

@bfjelds I added/modified the comments, please tell me if it better now

@diconico07 diconico07 merged commit 46ed7d3 into project-akri:main Jul 25, 2023
@diconico07 diconico07 deleted the fix-freeze-on-modification branch July 25, 2023 06:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Changing a Configuration freezes the other ones for that discovery handler
3 participants