Automatically sync all data to new nodes in an existing zone #7176

Open
widhalmt opened this issue May 10, 2019 · 7 comments
Labels
area/distributed Distributed monitoring (master, satellites, clients) enhancement New feature or request stalled Blocked or not relevant yet

Comments

@widhalmt
Member

Hi,

Our current approach for adding new endpoints to existing zones is to manually sync parts or all of /var/lib/icinga2/, as stated in https://github.com/Icinga/icinga2/blob/c2542710b7517bdbc3b14b7f5476e94a2785e581/doc/06-distributed-monitoring.md#distributed-monitoring-advanced-hints-initial-sync

This might be "good enough" for rather static setups but I'd really like to see a reliable automatic sync to new endpoints in the long term.

Why?

  • Because it helps a lot with automation
  • Because it helps a lot with very dynamic setups (think cloud deployments, containerized deployments)
  • Because even now that it is documented, people tend to miss important steps when reading documentation, especially when a step is out of line like this one (automatic sync for everything, except in this very special case)

I know that's not as easy to implement as it looks at first glance, and there are more pressing issues, but please keep this in mind for upcoming changes to the cluster sync mechanisms.

@widhalmt
Member Author

ref/NC/610207

@dnsmichi
Contributor

This isn't about being easy or hard to implement; it is about performance. I explained that to @lippserd this week.

Whenever an endpoint connects, it would need a full state sync of all involved objects. It doesn't know exactly what to receive, and neither does the master node. So you'll get a full object JSON blob over the wire, which the endpoint then has to apply. You cannot subscribe to events applied from the icinga2.state file; the cluster connection happens at any time in the future, independently of this. You cannot know which downtimes/comments are to be synced, so you need to go the full route.

Another problem is the design: if you can live with the fact that you must sync the full state, which endpoints are eligible for syncing such things? All nodes in a zone, or should satellites also receive a synced state?

After all, we already had this discussion years ago and decided against adding this. The main reason is performance: it would make the cluster reconnect an even more blocking problem. The workaround, now documented again, is the only help we can offer. Given that a secondary master setup happens just once or twice during a monitoring stack's lifetime, I think we can agree that a manual sync is a better option than degraded performance every time a reload happens.

Cheers,
Michael

@dnsmichi dnsmichi added area/distributed Distributed monitoring (master, satellites, clients) wontfix Deprecated, not supported or not worth any effort and removed wontfix Deprecated, not supported or not worth any effort labels May 10, 2019
@dnsmichi dnsmichi reopened this May 10, 2019
@dnsmichi
Contributor

To re-iterate: More discussion needed.

@dnsmichi dnsmichi added the TBD To be defined - We aren't certain about this yet label May 10, 2019
@widhalmt
Member Author

As a first guess, I could think of nodes knowing that they are "new" in a zone and then triggering a full sync, or of serial numbers/timestamps for configuration so that a sync only has to be done when they differ, maybe even just the delta as part of the API log.
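To make the idea more concrete, here is a minimal Python sketch of the serial number variant. It is not Icinga 2 code and does not reflect how the cluster protocol works today; `ZoneState`, `plan_sync` and the in-memory change log are hypothetical names made up for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class ZoneState:
    """Hypothetical runtime state of a zone: a serial plus a change log."""
    serial: int = 0
    objects: dict = field(default_factory=dict)  # object name -> state
    log: list = field(default_factory=list)      # (serial, name, state)

    def update(self, name, state):
        # Every change bumps the serial and is recorded for delta replay.
        self.serial += 1
        self.objects[name] = state
        self.log.append((self.serial, name, state))


def plan_sync(source, peer_serial):
    """Decide what a connecting endpoint needs, based on the serial it reports.

    peer_serial is None for a node that knows it is "new" in the zone.
    """
    if peer_serial is None or peer_serial > source.serial:
        # New (or inconsistent) node: fall back to the expensive full sync.
        return "full", dict(source.objects)
    if peer_serial == source.serial:
        # Serials match: nothing has to be transferred on reconnect.
        return "none", {}
    # Replay only log entries newer than the peer's serial; later entries win.
    delta = {name: state for serial, name, state in source.log
             if serial > peer_serial}
    return "delta", delta
```

In this sketch only a node that reports no serial pays for the full sync, while a reconnecting node with a matching serial transfers nothing. It ignores the hard parts raised above, such as which endpoints should take part and how long the log may grow.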

My main intention with this issue was to keep it in mind for further changes. Maybe it can't be done with the current implementation without sacrificing a lot of performance, but that doesn't mean future implementations will have the same flaw. I just hope this is one of the things kept in mind when changing the cluster sync.

@dnsmichi
Contributor

Changing the cluster sync only affects the static configuration file sync, nothing else. That is what #6716 is about; anything else won't be solved by it.

@widhalmt
Member Author

I didn't specifically think of #6716, but I'm positive that Icinga will be around for quite some time. Maybe it won't be fixed in 2.12, but maybe in 3.1. So I want to have issues that remind everyone, when designing new approaches, of things that were missing "back in the day".

@dnsmichi
Contributor

Ok, then I'll change the label.

@dnsmichi dnsmichi added stalled Blocked or not relevant yet and removed TBD To be defined - We aren't certain about this yet labels May 10, 2019
@Al2Klimov Al2Klimov added the enhancement New feature or request label Aug 13, 2020