Update a child's reachable state as soon as possible #10143

Open
nilmerg opened this issue Aug 30, 2024 · 0 comments
Labels
area/runtime Downtimes, comments, dependencies, events

Member

nilmerg commented Aug 30, 2024

Given a three-level dependency hierarchy, the child on the lowest level does not become unreachable when a parent on the first level goes down, unless a check result arrives afterwards.

graph LR;
    ChildHost-->pa["ParentHostA (Group 1)"];
    ChildHost-->pb["ParentHostB (Group 1)"];
    pa-->ga["GrandParentA (Group 2)"];
    pa-->gb["GrandParentB (Group 2)"];
    pb-->ga["GrandParentA (Group 2)"];
    pb-->gb["GrandParentB (Group 2)"];

Here, ChildHost must become unreachable, sooner or later, once both GrandParentA and GrandParentB are down.
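To make the expectation concrete, here is a minimal Python sketch of the hierarchy from the graph. It is not Icinga 2 code: the `Host` class, the `is_reachable` function, and the rule that a redundancy group is satisfied while at least one of its parents is up and itself reachable are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical stand-in for a checkable; not an Icinga 2 type."""
    name: str
    up: bool = True  # result of the host's own last check
    # redundancy group name -> parents in that group
    parents: dict[str, list["Host"]] = field(default_factory=dict)


def is_reachable(host: Host) -> bool:
    """Assumed rule: every redundancy group needs at least one parent
    that is up and is itself reachable."""
    for group in host.parents.values():
        if not any(p.up and is_reachable(p) for p in group):
            return False
    return True


# The hierarchy from the graph above.
ga = Host("GrandParentA")
gb = Host("GrandParentB")
pa = Host("ParentHostA", parents={"Group 2": [ga, gb]})
pb = Host("ParentHostB", parents={"Group 2": [ga, gb]})
child = Host("ChildHost", parents={"Group 1": [pa, pb]})

ga.up = False
one_down = is_reachable(child)   # one grandparent is still up

gb.up = False
both_down = is_reachable(child)  # both grandparents are down
```

In this model `one_down` is true and `both_down` is false, even though ParentHostA and ParentHostB still carry their last UP state. Evaluating reachability recursively on demand gives the expected answer; the question in this issue is when a stored reachable state should be refreshed.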

Expected behavior

I'm unsure whether we should update every child's reachable state in such a case. Since it will be updated anyway once a check is performed, how quickly that happens depends heavily on the check interval:

If the object is currently UP/OK, there's no real need to update it, since everything "is fine" and no one should worry about it. But what if the object already has a problem? If the check interval is relatively long, the reachable state will not be updated soon enough. Of course, the reason it already has a problem is not necessarily related to a parent. So maybe update such an object only if its state is UNKNOWN?

A very different case is when the dependency configuration mandates that checks are disabled once the parent goes down. Then the child will never be checked again, and the reachable state won't be updated at all without an explicit check issued by a user. Here I expect the reachable state to be updated before checks are disabled.
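The two cases above could be summarized as a small decision function. This is only a sketch of the options being discussed, not a settled design: the `State` enum, the function name, and the `checks_get_disabled` parameter are hypothetical, and the UNKNOWN-only rule is the tentative suggestion from the previous paragraph.

```python
from enum import Enum


class State(Enum):
    OK = "OK"              # also stands for UP
    WARNING = "WARNING"
    CRITICAL = "CRITICAL"  # also stands for DOWN
    UNKNOWN = "UNKNOWN"


def should_update_reachable_now(state: State, checks_get_disabled: bool) -> bool:
    """Decide whether a child's reachable state should be refreshed
    immediately when a parent goes down (hypothetical policy)."""
    if checks_get_disabled:
        # No further check result will arrive, so update before disabling.
        return True
    # Otherwise the next check refreshes it anyway; only hurry for
    # UNKNOWN, the tentative suggestion above.
    return state is State.UNKNOWN
```

With this policy, a child whose checks get disabled is always updated, while an OK or CRITICAL child with active checks waits for its next check result.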

--

So, I'm certain what to expect in cases where checks will be disabled. But if that's not the case, is it really necessary to traverse the entire hierarchy down to every child related in some way to the parent in question?
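For a sense of what such a traversal involves, the following sketch collects every direct and indirect child of a parent with a breadth-first search. The `children` mapping and the `descendants` function are hypothetical and only mirror the example graph; they are not part of Icinga 2.

```python
from collections import deque

# parent name -> direct children, mirroring the example graph
children = {
    "GrandParentA": ["ParentHostA", "ParentHostB"],
    "GrandParentB": ["ParentHostA", "ParentHostB"],
    "ParentHostA": ["ChildHost"],
    "ParentHostB": ["ChildHost"],
    "ChildHost": [],
}


def descendants(parent: str) -> list[str]:
    """Return all direct and indirect children, each exactly once,
    in breadth-first order."""
    seen: set[str] = set()
    order: list[str] = []
    queue = deque(children.get(parent, []))
    while queue:
        name = queue.popleft()
        if name in seen:
            continue  # reached via another path already
        seen.add(name)
        order.append(name)
        queue.extend(children.get(name, []))
    return order


affected = descendants("GrandParentA")
```

Here `affected` contains ParentHostA, ParentHostB and ChildHost. The work grows with the number of objects below the parent, which is the cost this question is weighing against the benefit of a prompt update.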

@Al2Klimov Al2Klimov added the area/runtime Downtimes, comments, dependencies, events label Sep 25, 2024