Cherry-pick #14829 to 7.6: [Metricbeat] Add Google Cloud Platform module #15575

sayden · 2020-01-15T12:47:59Z

Cherry-pick of PR #14829 to 7.6 branch. Original message:

Docs are still a work in progress, but most of the code is ready to go.

Seed PR for the Google Cloud Platform module for Metricbeat.

It includes the following:

Stackdriver metricset
Compute metricset, built on Stackdriver as a config-based module

Ignore the following metricsets, which are included in this PR for testing purposes but are not going to be merged yet (they'll be removed before merging):

Storage
Firebase
Firestore
Loadbalancing
PubSub

Some vocabulary for people new to Google Cloud

Here are some rough AWS equivalents of GCP services:

Stackdriver -> Cloudwatch
Compute -> EC2
PubSub -> SQS
Storage (GCS) -> S3
Firebase / Firestore -> ~DynamoDB
Bigquery -> ~Redshift+Athena

Labels / Metadata

You'll see many mentions of metadata inside the code. This refers to two different entities within GCP: labels and metadata. For Elasticsearch purposes both can be considered metadata, so whenever you read "label" or "metadata", both are treated as the same thing at the end of the pipeline.
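
As a minimal sketch of what "treated as the same thing" can mean, the snippet below flattens both sources into a single map. The function name `MergeMetadata` and the rule that metadata wins on a key collision are assumptions made for illustration; they are not taken from the module's code.

```go
package main

// MergeMetadata flattens GCP labels and GCP metadata into one map.
// Hypothetical helper: on a key collision the metadata value overwrites
// the label value, which is an arbitrary choice made for this sketch.
func MergeMetadata(labels, metadata map[string]string) map[string]string {
	merged := make(map[string]string, len(labels)+len(metadata))
	for k, v := range labels {
		merged[k] = v
	}
	for k, v := range metadata {
		merged[k] = v
	}
	return merged
}
```

A real implementation would probably namespace the two sources instead of letting one silently overwrite the other.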

Grouping of events

The way GCP labels metrics makes it somewhat complex to generate "service based events". GCP exports its metrics individually, so you don't request "compute metrics" or "metrics of this compute instance". Instead, you have to request "give all cpu_utilization values of compute instances", and a single response brings one or more values per instance, for a specified timeframe, for all your instances.

For example, a request for CPU utilization can return (in pseudocode):

{
    "metadata": {
        "zone": "eu-central-1",
        "project": "project1"
    },
    "metric": "cpu_utilization",
    "points": [
        {
            "time": 1,
            "value": 2,
            "metadata": {
                "instance": "instance-1"
            }
        },
        {
            "time": 2,
            "value": 2,
            "metadata": {
                "instance": "instance-1"
            }
        }
    ]
}

Then a second call must be made (in this example, to the Compute API) to request instance metadata, such as working group, network group, user labels, or user metadata, which is associated only with the instance and not with a particular metric like CPU. The response looks like this (again, in pseudocode):

{
    "instance":"instance-1",
    "metadata":{
        "user":{
            "key":"value"
        },
        "system":{
            "key":"value"
        }
    },
    ...
}

At the end, both responses for that particular metric must be grouped into a single event that shares some common metadata. For Compute, this includes instance_id and availability zone, apart from the timestamp. Each service requires a specific implementation to get non-Stackdriver metadata. The service metadata implementation is only developed for Compute at the moment and can be seen in googlecloud/stackdriver/compute; the rest of the services use only the metadata provided by Stackdriver.
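
The grouping step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the `Point` and `Event` types, the `GroupPoints` function, and the shape of the `instanceMeta` lookup are all hypothetical simplifications of the two pseudocode responses shown earlier.

```go
package main

import "sort"

// Point is a hypothetical, flattened view of one Stackdriver data point.
type Point struct {
	Metric   string
	Time     int64
	Value    float64
	Instance string
	Zone     string
}

// Event holds every metric value that shares an instance, zone and timestamp.
type Event struct {
	Instance string
	Zone     string
	Time     int64
	Metrics  map[string]float64
	Metadata map[string]string
}

// groupKey is the common metadata that identifies one event.
type groupKey struct {
	instance string
	zone     string
	time     int64
}

// GroupPoints merges points with the same (instance, zone, time) into one
// Event and attaches instance metadata fetched separately (for example,
// from the Compute API). instanceMeta is keyed by instance name.
func GroupPoints(points []Point, instanceMeta map[string]map[string]string) []Event {
	byKey := make(map[groupKey]*Event)
	for _, p := range points {
		k := groupKey{p.Instance, p.Zone, p.Time}
		ev, ok := byKey[k]
		if !ok {
			ev = &Event{
				Instance: p.Instance,
				Zone:     p.Zone,
				Time:     p.Time,
				Metrics:  map[string]float64{},
				Metadata: map[string]string{},
			}
			// Copy the service-level metadata once per event.
			for mk, mv := range instanceMeta[p.Instance] {
				ev.Metadata[mk] = mv
			}
			byKey[k] = ev
		}
		ev.Metrics[p.Metric] = p.Value
	}

	// Return events in a stable order: by instance, then by time.
	events := make([]Event, 0, len(byKey))
	for _, ev := range byKey {
		events = append(events, *ev)
	}
	sort.Slice(events, func(i, j int) bool {
		if events[i].Instance != events[j].Instance {
			return events[i].Instance < events[j].Instance
		}
		return events[i].Time < events[j].Time
	})
	return events
}
```

With this shape, two different metrics reported for instance-1 at the same timestamp end up in one event, while the same metric at two timestamps produces two events.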

ECS

Metadata returned from Stackdriver is ECS compliant for Compute metadata (mainly availability zone, account id and cloud provider, instance id and instance name). Some of the metadata might be written outside the ECS fields. More deployment configurations plus testing are needed to find them all.

Modules

All services from https://cloud.google.com/monitoring/api/metrics_gcp can be added through additional configuration. Tests so far show no problems, but their specific metadata must be developed separately for each of them.

Limitations

You cannot set period under 300s (you can right now, but it won't return any metrics). I think it's some kind of limitation of Stackdriver, because its metrics are sampled every 60 to 300 seconds.
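
Since a shorter period is currently accepted but returns nothing, a guard along these lines could surface the problem early. This is only a sketch: `CheckPeriod` and `minPeriod` are hypothetical names, and the 300s bound is simply the value observed above, not a documented Stackdriver constant.

```go
package main

import (
	"fmt"
	"time"
)

// minPeriod is the 300s lower bound observed in this PR (an assumption,
// not an official Stackdriver limit).
const minPeriod = 300 * time.Second

// CheckPeriod returns an error when the configured period is shorter than
// minPeriod, since such requests were seen to return no metrics.
func CheckPeriod(period time.Duration) error {
	if period < minPeriod {
		return fmt.Errorf("period %s is below %s: Stackdriver may return no metrics", period, minPeriod)
	}
	return nil
}
```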

Happy reviewing :)

Sorry for the big PR, it was impossible to make it smaller

Includes Stackdriver and Compute Metricset

(cherry picked from commit 8be7745)

# Conflicts:
#	NOTICE.txt
#	vendor/vendor.json

exekias

LGTM if CI is happy

[Metricbeat] Add Google Cloud Platform module (elastic#14829)

81ac67c

sayden requested a review from a team as a code owner January 15, 2020 12:47

sayden added backport review labels Jan 15, 2020

sayden self-assigned this Jan 15, 2020

Update notice. Run mage fmt update

402af9f

exekias approved these changes Jan 15, 2020

exekias merged commit 5ae15df into elastic:7.6 Jan 15, 2020

sayden deleted the backport_14829_7.6 branch October 29, 2021 08:57