Automatically sync all data to new nodes in an existing zone #7176

widhalmt · 2019-05-10T08:01:09Z

Hi,

Our current approach for adding new endpoints to existing zones is to manually sync parts or all of /var/lib/icinga2/, as described in https://github.com/Icinga/icinga2/blob/c2542710b7517bdbc3b14b7f5476e94a2785e581/doc/06-distributed-monitoring.md#distributed-monitoring-advanced-hints-initial-sync

This might be "good enough" for rather static setups but I'd really like to see a reliable automatic sync to new endpoints in the long term.

Why?

Because it helps a lot with automation
Because it helps a lot with very dynamic setups (think cloud deployments, containerized deployments)
Because even though it is now documented, people tend to miss important steps when reading documentation, especially when a step is an exception to the rule like this one (automatic sync for everything, except in this one special case)

I know that's not as easy to implement as it looks at first glance, and there are more pressing issues, but please keep this in mind for upcoming changes to the cluster sync mechanisms.

widhalmt · 2019-05-10T08:03:28Z

ref/NC/610207

dnsmichi · 2019-05-10T08:49:33Z

This isn't about being easy or hard to implement, it is about performance. I explained that to @lippserd this week.

Whenever an endpoint connects, it would need a full state sync of all involved objects. It doesn't know what exactly to receive, and neither does the master node. So you'll get a full object JSON blob over the wire, which the endpoint then has to apply. You cannot subscribe to events applied from the icinga2.state file; the cluster connection happens at some arbitrary later point, independent of this. You cannot know which downtimes/comments need to be synced, so you have to go the full route.

Another problem is the design: which endpoints are eligible for syncing such things, assuming you could live with having to sync the full state? All nodes in a zone, or should satellites also receive a synced state?

After all, we had this discussion years ago already, and decided against adding this. The main reason is performance: it would make the cluster reconnect an even more blocking problem. The workaround, now documented again, is the only help we can offer. Given that a secondary master setup happens just once or twice during a monitoring stack's lifetime, I think we can agree that a manual sync is a better option than degraded performance each time a reload happens.

Cheers,
Michael

dnsmichi · 2019-05-10T10:54:07Z

To re-iterate: More discussion needed.

widhalmt · 2019-05-10T11:00:14Z

As a first guess, I could think of nodes knowing that they are "new" in a zone and then triggering a full sync, or of serial numbers/timestamps for configuration so that a sync only has to be done when they differ, maybe even just the delta as part of the API log.
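To make the serial-number idea a bit more concrete, here is a rough sketch in Python. It is purely illustrative and not how Icinga 2 works today: `ChangeLog`, `record`, `prune` and `plan_sync` are hypothetical names, and it assumes the master could number every runtime change (downtimes, comments, ...) with a monotonically increasing serial. A connecting node reports the last serial it applied (or nothing if it is new), and the master decides between no sync, a delta, or a full sync.

```python
from typing import Dict, List, Optional, Tuple


class ChangeLog:
    """Hypothetical master-side log: every runtime change gets the next serial."""

    def __init__(self) -> None:
        self.serial = 0                    # serial of the newest recorded change
        self.entries: Dict[int, str] = {}  # serial -> description of the change

    def record(self, change: str) -> int:
        """Store a change under the next serial and return that serial."""
        self.serial += 1
        self.entries[self.serial] = change
        return self.serial

    def prune(self, keep_from: int) -> None:
        """Drop log entries with a serial lower than keep_from."""
        self.entries = {s: c for s, c in self.entries.items() if s >= keep_from}

    def plan_sync(self, node_serial: Optional[int]) -> Tuple[str, List[str]]:
        """Decide what a connecting node needs, based on the serial it reports."""
        if node_serial is None:
            # A node that is "new" in the zone has nothing yet: full sync.
            return ("full", [])
        if node_serial == self.serial:
            # Serials match: nothing to transfer.
            return ("none", [])
        needed = range(node_serial + 1, self.serial + 1)
        if node_serial > self.serial or any(s not in self.entries for s in needed):
            # Node is ahead of us, or the log no longer covers the gap.
            return ("full", [])
        # The log covers the gap: send only the missing changes.
        return ("delta", [self.entries[s] for s in needed])


log = ChangeLog()
log.record("downtime-1 added")
log.record("comment-7 added")
log.record("downtime-1 removed")

print(log.plan_sync(None))  # new node
print(log.plan_sync(3))     # node already up to date
print(log.plan_sync(1))     # node missed the last two changes
```

The expensive full sync would then only hit nodes that are new or that have been away for longer than the log reaches back; every other reconnect stays cheap. Whether such a serial can be maintained consistently across all endpoints in a zone is exactly the open design question.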

My main intention with this issue was to keep it in mind for further changes. Maybe it can't be done with the current implementation without sacrificing a lot of performance but this doesn't mean that future implementations will have the same flaw. I just hope that this is one of the things to be kept in mind when changing the cluster sync.

dnsmichi · 2019-05-10T11:14:53Z

Changing the cluster sync only affects the static configuration file sync, nothing else. This is what #6716 is about; anything else won't be solved by it.

widhalmt · 2019-05-10T11:23:37Z

I wasn't specifically thinking of #6716, but I'm positive that Icinga will be around for quite some time. Maybe it won't be fixed in 2.12, but maybe in 3.1. So I want to have issues that remind everyone, when designing new approaches, of the things that were missing "back in the day".

dnsmichi · 2019-05-10T11:42:56Z

Ok, then I'll change the label.

dnsmichi closed this as completed May 10, 2019

dnsmichi added the area/distributed and wontfix labels and removed the wontfix label May 10, 2019

dnsmichi reopened this May 10, 2019

dnsmichi added the TBD label May 10, 2019

dnsmichi added the stalled label and removed the TBD label May 10, 2019

Al2Klimov added the enhancement label Aug 13, 2020
