Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Prometheus metric mappings #9948

Merged
merged 15 commits into from
Jan 25, 2019
Merged

Conversation

exekias
Copy link
Contributor

@exekias exekias commented Jan 8, 2019

This change refactors the way we store Prometheus metrics
to avoid the need for user defined mappings.

Metrics will be stored under fields with their original names, the sames
that users see in Prometheus. All metric fields will use double type,
which is the internal store type in Prometheus.

All labels are stored under prometheus.labels.* and mapped as keyword.

As it was already happening, metrics are grouped when they share the same exact
set of labels. Only difference is that quantile (summaries) and le (histograms) labels
have been put back in labels, that will end up in some more documents when
these metric types are present.

As fields/labels mappings cannot longer collide, we can get rid of
namespacing, which could lead to mapping explosion. Users can still use
custom fields to differentiate namespaces.

Stats metricset has been removed as it is no longer necessary, the
module can now scrape these metrics from Prometheus server to correctly
mapped fields.

Example output:

{
    "@timestamp": "2017-10-12T08:05:34.853Z",
    "agent": {
        "hostname": "host.example.com",
        "name": "host.example.com"
    },
    "event": {
        "dataset": "prometheus.collector",
        "duration": 115000,
        "module": "prometheus"
    },
    "metricset": {
        "name": "collector"
    },
    "prometheus": {
        "metrics": {
            "http_request_duration_microseconds_count": 0,
            "http_request_duration_microseconds_sum": 0,
            "http_request_size_bytes_count": 0,
            "http_request_size_bytes_sum": 0,
            "http_response_size_bytes_count": 0,
            "http_response_size_bytes_sum": 0
        },
        "labels": {
            "handler": "drop_series"
        }
    },
    "service": {
        "address": "172.29.0.2:9090",
        "type": "prometheus"
    }
}

closes #9921

cc @elastic/infrastructure

This change refactors the way we store Prometheus metrics
to avoid the need for user defined mappings.

Metrics will be stored under fields with their original names, the sames
that users see in Prometheus. All metric fields will use `double` type,
which is the internal store type in Prometheus.

All labels are stored under `prometheus.labels.*` and mapped as keyword.

As fields/labels mappings cannot longer collide, we can get rid of
namespacing, which could lead to mapping explosion. Users can still use
custom fields to differentiate namespaces.

Stats metricset has been removed as it is no longer necessary, the
module can now scrape these metrics from Prometheus server to correctly
mapped fields.
@exekias exekias added enhancement in progress Pull request is currently in progress. module discuss Issue needs further discussion. Metricbeat Metricbeat breaking change labels Jan 8, 2019
@exekias exekias requested review from a team as code owners January 8, 2019 12:50
}

func GetPromEventsFromMetricFamily(mf *dto.MetricFamily) []PromEvent {
func (p *PromEvent) LabelsHash() string {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported method PromEvent.LabelsHash should have comment or be unexported

}, nil
}

func (m *MetricSet) Fetch() ([]common.MapStr, error) {
func (m *MetricSet) Fetch(reporter mb.ReporterV2) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exported method MetricSet.Fetch should have comment or be unexported

@urso urso removed the request for review from a team January 8, 2019 17:31
Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this 👍

Will we keep the mappings for the metricsets based in this one like the kubernetes ones?

Prometheus metric labels
- name: prometheus.*
type: object
object_type: double
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

short_config: false
release: ga
settings: ["ssl", "http"]
fields:
- name: prometheus
type: group
# Order is important here, labels will match first, the rest are double
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about placing labels in the base level to avoid possible issues with this? This is also the recommendation in ECS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to keep it like this for now, main reasons are:

  • It seems there is no way to do alias on labels.*
  • If other modules push labels there, it could lead to mapping issues if they are using dots

I want to make sure this module won't end on mapping issues and we can use these metrics reliably, not against this change once the problem is gone (for instance, if/when index per module is implemented).

WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I wonder some things (not blocking this PR)

It seems there is no way to do alias on labels.*

Why would we want to do it here?

If other modules push labels there, it could lead to mapping issues if they are using dots

I guess this is no different with any other module, maybe we shouldn't have a field for labels in ECS till we don't have a complete solution for these mapping issues.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is no different with any other module, maybe we shouldn't have a field for labels in ECS till we don't have a complete solution for these mapping issues.

That could make sense, I still have some doubts on how ECS will play out in cases like this, I would prefer to play it safe on this module and learn from the process with others 😇

cfgwarn.Beta("The prometheus collector metricset is beta")

config := struct {
Namespace string `config:"namespace" validate:"required"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@exekias exekias force-pushed the refactor-prometheus branch from aaec74f to 411cc77 Compare January 9, 2019 09:54
@exekias
Copy link
Contributor Author

exekias commented Jan 9, 2019

About existing modules based on the Prometheus helper, this change should not affect them

@ruflin
Copy link
Contributor

ruflin commented Jan 9, 2019

Can you elaborate on why this makes the stats metricset obsolete. I think for cases where we know the exact data structure we should still allow structured data?

@exekias
Copy link
Contributor Author

exekias commented Jan 9, 2019

stats metricset queries Prometheus server for its own metrics, it only retrieves 2 simple metrics. From what I see in our forums, many people get confused by the two metricsets, when normally what they want is the collector.

The collector will give you the two metrics anyway, so I'm inclined towards removing that complexity

@jsoriano
Copy link
Member

What will be the migration path for someone using current module?

@exekias
Copy link
Contributor Author

exekias commented Jan 10, 2019

I didn't plan to offer a migration plan, as it's a breaking change. Users should be able to still consume both old and new metrics, they will just need different queries. Do you think we need more?

@ruflin
Copy link
Contributor

ruflin commented Jan 11, 2019

@exekias I think the more general problem here is that we mix modules with the collection of data. stats for me is still a valid metricset as part of a module, collector is kind of a hack that we introduced as a metricset :-) When we compare this to Filebeat, this is more like an input.

Moving forward I think there are 2 options:

  1. We keep treating inputs and modules/metricsets as the same thing. We have this also in http and others
  2. We introduce the concept of an input.

I'm in favor of option 2 as I think it will remove the confusion that there are modules which are not really modules but not sure we can implement something like this on the short term.

@roncohen
Copy link
Contributor

+1 on separating inputs and modules for metricbeat. That being said, this PR is a step in the right direction. Would be interesting to see if we can make the input/module separation for 7.0 as it seems it would be a breaking change.

@exekias exekias removed the discuss Issue needs further discussion. label Jan 14, 2019
@exekias
Copy link
Contributor Author

exekias commented Jan 14, 2019

Thank you everyone for your feedback, I understand we are good with the proposed change, I will go ahead and add missing docs/examples. Will ping you back when it's ready for review

Carlos Pérez-Aradros Herce added 2 commits January 18, 2019 16:52
@exekias exekias force-pushed the refactor-prometheus branch from ab7a29f to ec1feed Compare January 22, 2019 14:21
@exekias
Copy link
Contributor Author

exekias commented Jan 22, 2019

docs updated, this is ready for a second round of review

@exekias exekias force-pushed the refactor-prometheus branch from 906f069 to 229c108 Compare January 22, 2019 14:49
Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few minor comments. Overall LGTM. Still not a big fan of removing the the stats metricset but we can add it later again if we see requests for it.

@@ -5,11 +5,9 @@ This file is generated! See scripts/docs_collector.py
[[metricbeat-module-prometheus]]
== Prometheus module

This module periodically fetches metrics from
This module periodically scrapes metrics from
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you remove the stats metricset, we could already be here more specific that it scrapes metrics from prometheus exporters?

"prometheus_sd_kubernetes_events_total": {
"value": 0
}
"http_request_duration_microseconds_count": 0,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative here would have been prometheus.metrics.* to not conflict with labels. Will also allow us to add more metadata in the future.

},
"service": {
"address": "172.29.0.2:9090",
"type": "prometheus"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting one. We should probably make it configurable so the user can set service.type and service.name.

[source,yaml]
-------------------------------------------------------------------------------------
metricbeat.modules:
- module: prometheus
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of like a second "flawor" of the metricset. In the future we potentially could add a simple config federate.enabled: true which will set the correct settings directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it, it would be a nice addition, I agree it can be added later

@ruflin
Copy link
Contributor

ruflin commented Jan 24, 2019

@exekias Will need a make update.

@exekias
Copy link
Contributor Author

exekias commented Jan 25, 2019

@ruflin I think this one is ready to go?

Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this in. We can still do minor changes in follow up PR's.

@@ -69,6 +69,8 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d

*Metricbeat*

- Refactor Prometheus metric mappings {pull}9948[9948]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs a note, that the stat metricset was removed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor Prometheus metrics storage
5 participants