
[7.7] [Monitoring] Collection changes #52639

Closed
cachedout opened this issue Dec 10, 2019 · 9 comments

@cachedout
Contributor

Monitoring changes in 7.x

From 7.7 through the end of the 7.x series in 7.9, components in the Elastic Stack that use internal collection should introduce changes that will ease the migration path for users connecting a 7.7-7.9 cluster to an 8.x monitoring cluster. Those changes are described below.

Enhancements to internal collection

Endpoint switch

The first change which needs to happen is to have internal collection in Kibana ship directly to the _bulk/monitoring endpoint on the monitoring cluster.
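For illustration, here is a rough TypeScript sketch of what shipping a document directly to the monitoring cluster could look like using the official Elasticsearch JS client. The host, credentials, and target index are placeholders, and the exact endpoint and document shape are precisely what this issue is meant to settle; this is not the implementation.

import { Client } from '@elastic/elasticsearch';

// Hypothetical monitoring cluster connection; node and credentials are placeholders.
const monitoringClient = new Client({
  node: 'https://monitoring-cluster.example.com:9200',
  auth: { username: 'kibana_system', password: 'changeme' },
});

export async function shipKibanaStats(targetIndex: string, doc: Record<string, unknown>) {
  // A bulk body alternates action metadata and document source. Note there is no
  // _type in the action header; the index name must match whatever the monitoring
  // exporter produces today (see the notes under "Document reshaping" below).
  await monitoringClient.bulk({
    body: [{ index: { _index: targetIndex } }, doc],
  });
}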

Configuration option

Introduce configuration options that let Kibana write directly to _bulk/monitoring.

These configuration options should follow the standard introduced by the Beats project. Namely, we should introduce settings prefixed with monitoring.elasticsearch and begin issuing deprecation warnings for the existing xpack.monitoring class of settings if no monitoring.elasticsearch settings are set.

Discussion points:

  1. If users are sending data from Kibana to their production cluster and it is being forwarded to the monitoring cluster, we need to make sure that any reconfiguration they undertake is made with that in mind. Is there anything beyond mentioning this caveat in a deprecation message that might help?
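As a hypothetical sketch of the precedence and deprecation behaviour described above: the setting names mirror the Beats convention, but the helper and its shape are purely illustrative and not Kibana's actual configuration API.

// Hypothetical helper; names mirror the Beats convention, wiring is illustrative.
interface MonitoringEsConfig {
  hosts: string[];
  username?: string;
  password?: string;
}

export function resolveMonitoringOutput(
  monitoringEs: MonitoringEsConfig | undefined,   // monitoring.elasticsearch.*
  legacyXpackEs: MonitoringEsConfig | undefined,  // xpack.monitoring.elasticsearch.*
  logDeprecation: (message: string) => void
): MonitoringEsConfig | undefined {
  if (monitoringEs) {
    return monitoringEs; // the new settings win when both are present
  }
  if (legacyXpackEs) {
    logDeprecation(
      'xpack.monitoring.elasticsearch.* is deprecated; use monitoring.elasticsearch.* instead.'
    );
    return legacyXpackEs;
  }
  // Neither is set: fall back to legacy collection through the production cluster.
  return undefined;
}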

Document reshaping

As described above, a monitoring cluster running Elasticsearch currently exposes a plugin which reshapes the document before it is written. With this change, incoming data will no longer be routed through that plugin, and as a result Kibana itself will need to send data shaped in a manner consistent with what the monitoring plugin does today.

Below are documents which describe this change. (Thank you to @jakelandis for insight into this process):

Original sent by Kibana

I got this by sniffing traffic and looking for HTTP POSTs to _monitoring/bulk. Below is one example.

{"index":{"_type":"kibana_stats"}}
 {
 	"kibana": {
 		"uuid": "5b2de169-2785-441b-ae8c-186a1936b17d",
 		"name": "Mikes-MacBook-Pro.local",
 		"index": ".kibana",
 		"host": "localhost",
 		"transport_address": "localhost:5603",
 		"version": "8.0.0",
 		"snapshot": false,
 		"status": "green"
 	},
 	"concurrent_connections": 0,
 	"os": {
 		"load": {
 			"1m": 3.55908203125,
 			"5m": 2.7646484375,
 			"15m": 2.5361328125
 		},
 		"memory": {
 			"total_in_bytes": 34359738368,
 			"free_in_bytes": 1129164800,
 			"used_in_bytes": 33230573568
 		},
 		"uptime_in_millis": 625188000,
 		"platform": "darwin",
 		"platformRelease": "darwin-18.7.0"
 	},
 	"process": {
 		"event_loop_delay": 0.6557079553604126,
 		"memory": {
 			"heap": {
 				"total_in_bytes": 392155136,
 				"used_in_bytes": 305605720,
 				"size_limit": 1526909922
 			},
 			"resident_set_size_in_bytes": 262111232
 		},
 		"uptime_in_millis": 848682
 	},
 	"requests": {
 		"disconnects": 0,
 		"total": 0
 	},
 	"response_times": {
 		"average": 0,
 		"max": 0
 	},
 	"timestamp": "2019-12-10T13:41:50.331Z"
 }

Same document after being reshaped by Elasticsearch

One can compare this to a document after it is indexed by Elasticsearch. Below is an example of an indexed document:

  {
  	"_index": ".monitoring-kibana-7-2019.12.10",
  	"_id": "Kqge724BaNJWkC7r0oEz",
  	"_score": 1.0,
  	"_source": {
  		"cluster_uuid": "nHfBak70Sy2QffyZG-2dyg",
  		"timestamp": "2019-12-10T09:23:24.848Z",
  		"interval_ms": 10000,
  		"type": "kibana_stats",
  		"source_node": {
  			"uuid": "VWzyO39IRbiuaEW8lALREg",
  			"host": "127.0.0.1",
  			"transport_address": "127.0.0.1:9300",
  			"ip": "127.0.0.1",
  			"name": "Mikes-MacBook-Pro.local",
  			"timestamp": "2019-12-10T09:23:24.848Z"
  		},
  		"kibana_stats": {
  			"kibana": {
  				"uuid": "5b2de169-2785-441b-ae8c-186a1936b17d",
  				"name": "Mikes-MacBook-Pro.local",
  				"index": ".kibana",
  				"host": "localhost",
  				"transport_address": "localhost:5603",
  				"version": "8.0.0",
  				"snapshot": false,
  				"status": "green"
  			},
  			"concurrent_connections": 0,
  			"os": {
  				"load": {
  					"1m": 2.85693359375,
  					"5m": 3.96240234375,
  					"15m": 3.4453125
  				},
  				"memory": {
  					"total_in_bytes": 34359738368,
  					"free_in_bytes": 626302976,
  					"used_in_bytes": 33733435392
  				},
  				"uptime_in_millis": 609682000,
  				"platform": "darwin",
  				"platformRelease": "darwin-18.7.0"
  			},
  			"process": {
  				"event_loop_delay": 0.783735990524292,
  				"memory": {
  					"heap": {
  						"total_in_bytes": 436719616,
  						"used_in_bytes": 316087208,
  						"size_limit": 1526909922
  					},
  					"resident_set_size_in_bytes": 820559872
  				},
  				"uptime_in_millis": 129449.99999999999
  			},
  			"requests": {
  				"disconnects": 0,
  				"total": 6
  			},
  			"response_times": {
  				"average": 61,
  				"max": 98
  			},
  			"timestamp": "2019-12-10T09:23:24.787Z"
  		}
  	}
  }

(Just to note, these aren't the same requests. In any comparison, be sure to account for any difference in field values.)

  1. Note the additional data structure at the root of the document after it has been indexed (a minimal reshaping sketch in TypeScript follows this list):
	{
  		"cluster_uuid": "nHfBak70Sy2QffyZG-2dyg",
  		"timestamp": "2019-12-10T09:23:24.848Z",
  		"interval_ms": 10000,
  		"type": "kibana_stats",
  		"source_node": {
  			"uuid": "VWzyO39IRbiuaEW8lALREg",
  			"host": "127.0.0.1",
  			"transport_address": "127.0.0.1:9300",
  			"ip": "127.0.0.1",
  			"name": "Mikes-MacBook-Pro.local",
  			"timestamp": "2019-12-10T09:23:24.848Z"
  		},
		...
  2. We will need to ensure that the document indexed by Kibana has the correct bulk action header. Concretely, this means that you will need to exclude _type, and the index name will need to exclude the dot.
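To make the reshaping concrete, below is an illustrative TypeScript sketch that wraps the payload Kibana sends today into the shape shown in the indexed document above. The field names are copied from that example; how cluster_uuid and source_node would be obtained when shipping directly is an open question, so the function signature is hypothetical.

interface SourceNode {
  uuid: string;
  host: string;
  transport_address: string;
  ip: string;
  name: string;
  timestamp: string;
}

// Illustrative only: nest the raw payload under its type name and add the
// metadata the monitoring plugin adds today.
export function reshapeKibanaStats(
  rawDoc: Record<string, unknown>, // the body Kibana currently posts to _monitoring/bulk
  clusterUuid: string,
  sourceNode: SourceNode,
  intervalMs: number
) {
  return {
    cluster_uuid: clusterUuid,
    timestamp: new Date().toISOString(),
    interval_ms: intervalMs,
    type: 'kibana_stats',
    source_node: sourceNode,
    kibana_stats: rawDoc,
  };
}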

There is additional discussion on these points in a corresponding Logstash issue.

Stack co-ordination

This is a change that we would like to have happen in 7.7 across a variety of stack components. We would like to ask that all teams have their corresponding PRs ready to be merged by March 1st, 2020, so that we can ensure all changes are ready and there is no inconsistency between stack components.

It is imperative that these changes not be merged into 7.6, however, because we may not be ready for them on the Stack Monitoring Kibana application end. (I will update this issue if that changes.)

@elasticmachine
Contributor

Pinging @elastic/stack-monitoring (Team:Monitoring)

@chrisronline
Contributor

One potential hiccup here.

If we want to maintain configuration parity with the other stack products, we should use monitoring.* config keys. However, it looks like Kibana x-pack plugins can only define configuration keys with an explicit prefix (which we already have defined). We can read any config we want, but it looks like we might need to take some extra steps to define these keys.

It's worth verifying this behavior with the platform folks, and perhaps understanding whether this is also true in the new platform.

A couple of options come to mind for how to handle this:

  1. Change the config keys to something under xpack.monitoring (but different from the existing xpack.monitoring keys)
  2. Create a plugin in OSS that is called monitoring and simply defines these keys and does nothing more (a rough sketch follows this list).
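As a rough sketch of what option 2 might look like under the new platform, assuming @kbn/config-schema is available to the plugin; the plugin wiring is omitted and the field list is illustrative only.

import { schema, TypeOf } from '@kbn/config-schema';

// Sketch of option 2: an OSS "monitoring" plugin whose only job is to declare
// the monitoring.elasticsearch.* keys. Field list and placement are illustrative.
export const configSchema = schema.object({
  elasticsearch: schema.object({
    hosts: schema.maybe(schema.arrayOf(schema.string())),
    username: schema.maybe(schema.string()),
    password: schema.maybe(schema.string()),
  }),
});

export type MonitoringConfig = TypeOf<typeof configSchema>;

// Registered under the plugin id "monitoring", these keys would surface as
// monitoring.elasticsearch.*, matching the Beats/Logstash convention.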

@cachedout
Contributor Author

@chrisronline Good find! I wasn't aware of that limitation. I'd really like it if we could maintain consistency across the products. Any recommendations of who we can talk to on the platform side who might be able to help? @tylersmalley can you help point us in the right direction here?

@tylersmalley
Contributor

@joshdover is the xpack prefix enforced in the new platform?

@cachedout have the other teams removed the prefix for all "X-Pack" plugins? Or just a handful? I am all for consistency among the stack, but we should also have consistency within Kibana. So if we do this, we should do it in 8.0 across the board as a breaking change.

@cachedout
Contributor Author

@tylersmalley I can't speak to whether teams follow the xpack prefix by convention or whether it is enforced in code. Presently, we're following a change made on the Beats side, so perhaps @ycombinator or @urso might be able to speak for that team?

@ycombinator
Contributor

ycombinator commented Dec 13, 2019

I can give some history for why we added monitoring.elasticsearch.* in Beats.

Historically, Beats has had the xpack.monitoring.elasticsearch.* settings, just like in Logstash, to indicate to the user that this is the production cluster to which monitoring data for the Beat instance would be sent.

When we decided to implement sending the monitoring data directly from the Beat to the monitoring cluster, we needed a different set of elasticsearch settings (hosts, auth, TLS, etc.) to indicate to the user that this is the monitoring cluster to which monitoring data for the Beat instance would be sent. So we decided to introduce monitoring.elasticsearch.* for that purpose.

This new class of settings was introduced in 7.2.0. We kept xpack.monitoring.elasticsearch.* around, though, and deprecated it. This is because we wanted users to send the monitoring data directly to the monitoring cluster, to remove their reliance on the "legacy" monitoring approach of sending data through the production cluster, which will ultimately enable us to remove the custom monitoring bulk API endpoint from Elasticsearch in 8.0.0. At that time (8.0.0), we (Beats) also plan to remove the deprecated xpack.monitoring.* settings from its configuration and the associated code path underneath.

FWIW, a similar change (adding a new configuration setting for shipping directly to the monitoring cluster) is being proposed for Logstash as well: elastic/logstash#11403.

@ycombinator
Contributor

It also occurs to me that in Kibana's case, there already is a setting to indicate the monitoring Elasticsearch cluster and it is... xpack.monitoring.elasticsearch.*. Currently this setting is used for reading data from the monitoring cluster, not for writing Kibana's monitoring data to it. For the latter purpose, I believe the top-level elasticsearch.* settings are used (@chrisronline can you confirm?).

Not saying that necessarily changes anything about the discussion of introducing a monitoring.elasticsearch.* class of settings, but wanted to point this out for completeness.

@chrisronline
Contributor

The changes for Kibana are a little bit different from those for Beats/Logstash. Like Beats/Logstash, Kibana has configuration like xpack.monitoring.*, but it isn't for the same purpose. For Kibana, these settings are used to create a read connection directly to the monitoring cluster, used by the Stack Monitoring UI.

Just like Beats/Logstash, Kibana needs to know where to send its monitoring data, but there is no dedicated configuration for this purpose - we simply reuse the connection established with the elasticsearch.* settings. We write through a custom monitoring endpoint in ES, which is then "exported" to the monitoring cluster (through ES monitoring exporters, assuming an http exporter is configured).

So, currently, we do not have any Kibana configuration that allows users to say "write my monitoring data to this cluster".

This ticket is about adding that configuration, both for feature parity with Beats/Logstash (PR pending) and as a way to ease the eventual migration to using Metricbeat as the collector/shipper of monitoring data.

It's not a breaking change - nothing is changing with the current configuration.

@smith
Contributor

smith commented Feb 27, 2023

I don't think there's anything we need to do here.

@smith closed this as not planned Feb 27, 2023
@zube bot closed this as completed May 29, 2023