Automatically enrich Kubernetes module events #7470

Merged (29 commits) on Jul 13, 2018

Conversation

Contributor

@exekias exekias commented Jun 28, 2018

This PR adds automatic enrichment of kubernetes module metricsets. It behaves like add_kubernetes_metadata, but enriches all events coming out of this module by default. It adds labels and annotations not only to Pods but to any other resource, such as nodes, containers, deployments...

It is on by default; some configuration options are available:

- module: kubernetes
  metricsets:
    - pod
    - node
    ...
  include_labels:
    - tier
  include_annotations:
    - ...

Closes #7148

TODO:

  • container
  • state_container
  • state_deployment
  • state_statefulset
  • state_replicaset
  • update docs
  • add more tests
  • extend metricbeat service account permissions

@exekias exekias added the enhancement, in progress, Metricbeat, and containers labels Jun 28, 2018
return str
}

func BuildMetadataEnricher(

exported function BuildMetadataEnricher should have comment or be unexported

return kubernetes.NewWatcher(client, resource, options)
}

func NewResourceMetadataEnricher(

exported function NewResourceMetadataEnricher should have comment or be unexported

// PodMetadata generates metadata for the given pod, taking into account certain filters
PodMetadata(pod *Pod) common.MapStr

// ContainerMetadata generates metadata for the given container of a pod
ContainerMetadata(pod *Pod, container string) common.MapStr
}

type metaGenerator struct {
// Config for MetaGenerator

comment on exported type MetaGeneratorConfig should be of the form "MetaGeneratorConfig ..." (with optional leading article)

@exekias exekias force-pushed the kubernetes-module-metadata branch 2 times, most recently from a4feab7 to a5db15b Compare June 29, 2018 11:52
return strings.Join(fields, ":")
}

func BuildMetadataEnricher(

exported function BuildMetadataEnricher should have comment or be unexported

return enricher
}

func NewContainerMetadataEnricher(

exported function NewContainerMetadataEnricher should have comment or be unexported

@exekias exekias force-pushed the kubernetes-module-metadata branch 3 times, most recently from d79ab0b to ac1c226 Compare June 29, 2018 15:23
@exekias exekias added the review label Jun 29, 2018
@exekias exekias force-pushed the kubernetes-module-metadata branch 3 times, most recently from 5f2e824 to 1513b76 Compare July 3, 2018 10:05
@exekias exekias requested review from jsoriano and ruflin July 4, 2018 09:48
@exekias exekias removed the in progress Pull request is currently in progress. label Jul 4, 2018
@exekias exekias force-pushed the kubernetes-module-metadata branch from 2ac2ea6 to 18258a9 Compare July 4, 2018 10:45
@@ -6890,12 +6890,12 @@ Kubernetes pod name

--

*`kubernetes.pod.uid`*::
*`kubernetes.uid`*::
Member

Breaking change? should we keep both fields till 7.0?

Contributor Author

kubernetes.pod.uid is still unreleased, so this change should be ok

Member

Oh ok :)

Contributor

Oh, it's not released yet, then my comment above may apply :-)

Contributor

For my education: This is unique for 1 kubernetes "cluster". Is there also such a thing as a unique pod id?

Contributor Author

UIDs are assigned to all resources; by implementation, they are UUIDs. I struggled to decide whether to put them under kubernetes.uid or to create kubernetes.*.uid for all resources. My reasoning leaned towards using the global namespace, as we do with labels or annotations. On the other hand, fields like name are not globally unique, hence pod.name makes sense for sure.

I'm open to opinions here, as this is semantically tricky.

Contributor

Can there be cases where more than one of the IDs from 2 different resources is in the same event? For example, having the pod.uid and the kubernetes.uid which describes the overall cluster in one event? If yes, I would probably put each uid under its resource. Also, correlation-wise it probably only makes sense to correlate on pod.uid, but not to correlate it with any other resources?

Contributor Author

That case can happen, since #7231

But same would apply to labels/annotations. There is always a main resource for the event, and we put the labels & annotations for that one. Would it make sense to move those under pod.labels?

Contributor

So far I assume labels are a more generic thing and similar labels apply to different resources. A query could be "give me all resources with label a". Based on this assumption, I think the current handling of labels and tags still makes sense. But if the labels don't overlap, as in the case of the uid, the above would seem to make more sense.

Contributor Author

Pushed b6d8f76 to move it back to pod.uid

IncludeLabels []string `config:"include_labels"`
ExcludeLabels []string `config:"exclude_labels"`
IncludeAnnotations []string `config:"include_annotations"`
IncludePodUID bool `config:"include_pod_uid"`
IncludeUID bool `config:"include_uid"`
IncludeCreatorMetadata bool `config:"include_creator_metadata"`
Member

Are these config settings documented?

Contributor Author

No, I struggle a bit with these; I wonder if that means we should just hardcode them. They were kept just in case someone wants to disable them, but I'm not sure.

Member

Is it currently disabled by default?

As this doesn't look so costly to collect, maybe we can just add them, and they can be removed with drop_fields.

Contributor Author

Yes, so far I have left them undocumented to see how much pushback they get once released. I can leave a note here to say we want to remove these settings.


# Enriching parameters:
add_metadata: true
in_cluster: true
Member

Should these be commented out or removed here, since this is the default?

@@ -82,5 +89,7 @@ func (m *MetricSet) Fetch() (common.MapStr, error) {
return nil, err
}

m.enricher.Enrich([]common.MapStr{event})

Member

@jsoriano jsoriano Jul 5, 2018

With the current abstractions, metricsets have basically the same Fetch (and New) implementation for all resources; maybe we could have a single metricset for them, parametrized with the eventMapper and the enricher.

Contributor Author

Yep, I had the same feeling, I think it should be done in a follow up PR

Contributor

Perhaps time to also convert it to v2 API.
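
The single-metricset idea raised earlier in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: `genericMetricSet`, `fetchBody`, `podMapper`, and `labelEnricher` are hypothetical names, the `enricher` interface is trimmed down to one method, and a plain map stands in for `common.MapStr`. It is not the code from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// event stands in for common.MapStr in this sketch.
type event = map[string]interface{}

// eventMapper converts a raw response body into events (hypothetical signature).
type eventMapper func(body []byte) ([]event, error)

// enricher is an assumed, trimmed-down version of the enricher interface.
type enricher interface {
	Enrich(events []event)
}

// labelEnricher is an illustrative enricher that adds one fixed field.
type labelEnricher struct{ key, value string }

func (l labelEnricher) Enrich(events []event) {
	for _, ev := range events {
		ev[l.key] = l.value
	}
}

// genericMetricSet holds the per-resource pieces; Fetch is shared by all resources.
type genericMetricSet struct {
	fetchBody func() ([]byte, error)
	mapper    eventMapper
	enricher  enricher
}

// Fetch retrieves the body, maps it to events, and enriches them in place.
func (m *genericMetricSet) Fetch() ([]event, error) {
	body, err := m.fetchBody()
	if err != nil {
		return nil, err
	}
	events, err := m.mapper(body)
	if err != nil {
		return nil, err
	}
	m.enricher.Enrich(events)
	return events, nil
}

// podMapper is a toy mapper: it produces one event named after the body.
func podMapper(body []byte) ([]event, error) {
	if len(body) == 0 {
		return nil, errors.New("empty body")
	}
	return []event{{"name": string(body)}}, nil
}

func main() {
	ms := &genericMetricSet{
		fetchBody: func() ([]byte, error) { return []byte("nginx"), nil },
		mapper:    podMapper,
		enricher:  labelEnricher{key: "tier", value: "frontend"},
	}
	events, err := ms.Fetch()
	fmt.Println(events, err)
}
```

With this shape, each resource would only supply its own mapper and enricher, while the fetch-map-enrich sequence lives in one place.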

}
m.watcherStarted = true
}
}
Member

And a Stop() to release the watcher resources?

Contributor Author

Added, thank you for bringing this up, it was in my TODO 👍

"github.com/elastic/beats/metricbeat/mb"
)

var nilEnricher = &enricher{}
Member

We could define a nilEnricher type that implements the interface with empty methods, to avoid needing the if e == nilEnricher in all methods of enricher.

Contributor Author

pushed
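
For readers following along, the no-op type suggested in this thread might be sketched as follows. This is an assumption-laden illustration, not the pushed change: the `Enricher` interface shape, `staticEnricher`, `newEnricher`, and `errNoAPI` are all hypothetical, and plain maps stand in for `common.MapStr`.

```go
package main

import (
	"errors"
	"fmt"
)

// Enricher is an assumed, trimmed-down version of the interface discussed here.
type Enricher interface {
	Start()
	Stop()
	Enrich(events []map[string]interface{})
}

// nilEnricher implements Enricher with empty methods, so the real
// implementation never needs an `if e == nilEnricher` guard.
type nilEnricher struct{}

func (nilEnricher) Start()                          {}
func (nilEnricher) Stop()                           {}
func (nilEnricher) Enrich([]map[string]interface{}) {}

// staticEnricher is an illustrative implementation that adds fixed metadata.
type staticEnricher struct {
	meta map[string]interface{}
}

func (s *staticEnricher) Start() {}
func (s *staticEnricher) Stop()  {}
func (s *staticEnricher) Enrich(events []map[string]interface{}) {
	for _, ev := range events {
		for k, v := range s.meta {
			ev[k] = v
		}
	}
}

// errNoAPI is a hypothetical setup error used only for this example.
var errNoAPI = errors.New("kubernetes API not reachable")

// newEnricher falls back to the no-op enricher when setup failed.
func newEnricher(meta map[string]interface{}, err error) Enricher {
	if err != nil {
		return nilEnricher{}
	}
	return &staticEnricher{meta: meta}
}

func main() {
	ev := map[string]interface{}{"name": "test"}

	// Setup failed: Enrich is a safe no-op and the event is left untouched.
	newEnricher(nil, errNoAPI).Enrich([]map[string]interface{}{ev})
	fmt.Println(len(ev))

	// Setup succeeded: the event gains the extra metadata.
	newEnricher(map[string]interface{}{"uid": "abc"}, nil).Enrich([]map[string]interface{}{ev})
	fmt.Println(len(ev))
}
```

Callers hold an `Enricher` either way, so the metricset code path is identical whether or not enrichment is available.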

func(m map[string]common.MapStr, r kubernetes.Resource) {
pod := r.(*kubernetes.Pod)
for _, container := range append(pod.GetSpec().GetContainers(), pod.GetSpec().GetInitContainers()...) {
id := join(r.GetMetadata().GetNamespace(), r.GetMetadata().GetName(), container.GetName())
Member

You could also have a struct type for the ids, so the map is map[resourceID]common.MapStr, and id := resourceID{r.GetMetadata().GetNamespace(), ....
I have had bad experiences in the past building map keys by joining strings 🙂

Contributor Author

I would say that in general I can make the same mistakes with both approaches: the join method is syntactically similar to your resourceID, and in both cases I can pass parameters in the wrong order.
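
To make the trade-off in this thread concrete, here is a small sketch comparing the two key styles. It assumes the `:` separator seen in the `join` helper earlier in the diff; `joinKey` and the `resourceID` field names are hypothetical. The collision shown requires a field that contains the separator, which may well never happen for pod and container names, so treat it as an illustration of the general risk rather than a bug in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// resourceID is a hypothetical comparable struct usable as a map key.
type resourceID struct {
	Namespace string
	Name      string
	Container string
}

// joinKey mirrors the string-joining approach, assuming ":" as the separator.
func joinKey(fields ...string) string {
	return strings.Join(fields, ":")
}

func main() {
	// Two different field tuples that produce the same joined key,
	// because one field contains the separator.
	a := joinKey("ns", "a:b", "c")
	b := joinKey("ns", "a", "b:c")
	fmt.Println("joined keys equal:", a == b)

	// The struct keys keep the fields apart, so they stay distinct.
	ka := resourceID{Namespace: "ns", Name: "a:b", Container: "c"}
	kb := resourceID{Namespace: "ns", Name: "a", Container: "b:c"}
	fmt.Println("struct keys equal:", ka == kb)

	meta := map[resourceID]string{ka: "first", kb: "second"}
	fmt.Println("distinct entries:", len(meta))
}
```

Both styles still allow arguments to be passed in the wrong order, as noted above; the struct form only removes the separator ambiguity.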

Contributor Author

exekias commented Jul 6, 2018

Thank you for the review @jsoriano!! It should be ready for a second pass. I'll wait for that before rebasing master

@exekias exekias force-pushed the kubernetes-module-metadata branch from 4e9a754 to 3c74a22 Compare July 6, 2018 08:31
Member

@jsoriano jsoriano left a comment

LGTM!

@exekias exekias force-pushed the kubernetes-module-metadata branch from 3c74a22 to 7d3d613 Compare July 6, 2018 08:43
Contributor Author

exekias commented Jul 6, 2018

👍 rebased master

"node": common.MapStr{"name": "test"},
"labels": common.MapStr{"a": common.MapStr{"value": "bar", "key": "foo"}},
"pod": common.MapStr{"name": ""},
"uid": "005f3b90-4b9d-12f8-acf0-31020a840133",
Contributor

As uid has already been around before, we should not change anything here, so this is only a note from my side.

I often use id instead of uuid or uid for consistency. And as long as there is only one id I expect it to be unique.

Contributor Author

UID is a well-known field in Kubernetes, changing it to id could confuse some users IMO

Contributor

ok

Contributor

@ruflin ruflin left a comment

Left a few questions. Most can also be addressed in a follow up PR.

return events, nil
}

// Close stops this metricset
func (m *MetricSet) Close() error {
Contributor

Did you test if Close is called?

return kubernetes.NewWatcher(client, resource, options)
}

// NewResourceMetadataEnricher returns a Enricher configured for kubernetes resource events
Contributor

Nit: an Enricher


watcher, err := GetWatcher(base, resource, nodeScope)
if err != nil {
logp.Warn("Error initializing Kubernetes metadata enricher: %s", err)
Contributor

I would prefer not to use Warn but either Info or Error. Seems more like Info in this case as even if the enricher fails, the metricset still starts?

Contributor Author

It starts, but something is wrong. I used Warn to draw the user's attention to this, as it's unexpected. I can switch to Info or Error if you prefer.

Contributor

We tried to get rid of all Warn messages in the past. The reason is that for Warn it's not clear whether users have to take action or not. It sounds like in the above case they have to take action, so probably move it to Error.


metaConfig := kubernetes.MetaGeneratorConfig{}
if err := base.Module().UnpackConfig(&metaConfig); err != nil {
logp.Warn("Error initializing Kubernetes metadata enricher: %s", err)
Contributor

See comment above.


watcher, err := GetWatcher(base, &kubernetes.Pod{}, nodeScope)
if err != nil {
logp.Warn("Error initializing Kubernetes metadata enricher: %s", err)
Contributor

See comment above, applies also further below.

@@ -28,6 +37,8 @@
# - state_container
# period: 10s
# hosts: ["kube-state-metrics:8080"]
# add_metadata: true
Contributor

Not introduced in this PR, but I think the commenting with # should be the same as in the first part of the config (line 16 etc.).

Contributor Author

exekias commented Jul 11, 2018

@ruflin let me know if this one is good to go, I can rebase again once reviewed

}, nil
}

// Fetch method implements the data gathering and data conversion to the right format.
// It returns the event, which is then forwarded to the output. In case of an error, a
// descriptive error must be returned.
func (m *MetricSet) Fetch() ([]common.MapStr, error) {
m.enricher.Start()
Contributor

I wonder if this should be moved to New instead of start. That way you would not need the bool isStarted inside start, and it would be called only once.

Also I'm a bit worried about potential race conditions for the isStarted variable. What if Fetch() and Stop are called almost at the same time? Or Fetch is for some reason called twice in a very short period?

Contributor Author

Moving it to New is probably not a good option, since some processes instantiate a module without launching it. I will add a mutex for the variable.

Contributor

What modules are these?

}

func (m *enricher) Start() {
if !m.watcherStarted {
Contributor

I think combined with Stop watcherStarted could lead to a race condition.
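
The mutex approach mentioned in the previous thread could be sketched like this. It is a minimal illustration, assuming a hypothetical `fakeWatcher` in place of the real Kubernetes watcher and an `enricher` reduced to the fields relevant to the race; the actual fix in the PR may differ.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeWatcher is a hypothetical stand-in that counts Start and Stop calls.
type fakeWatcher struct{ starts, stops int }

func (w *fakeWatcher) Start() error { w.starts++; return nil }
func (w *fakeWatcher) Stop()        { w.stops++ }

// enricher guards watcherStarted with a mutex so that concurrent
// Start and Stop calls cannot observe or write a stale value.
type enricher struct {
	mu             sync.Mutex
	watcher        *fakeWatcher
	watcherStarted bool
}

// Start starts the watcher at most once, even when called on every Fetch.
func (e *enricher) Start() {
	e.mu.Lock()
	defer e.mu.Unlock()
	if e.watcherStarted {
		return
	}
	if err := e.watcher.Start(); err != nil {
		return // leave watcherStarted false so a later call can retry
	}
	e.watcherStarted = true
}

// Stop stops the watcher only if it is currently running.
func (e *enricher) Stop() {
	e.mu.Lock()
	defer e.mu.Unlock()
	if !e.watcherStarted {
		return
	}
	e.watcher.Stop()
	e.watcherStarted = false
}

func main() {
	e := &enricher{watcher: &fakeWatcher{}}

	// Simulate Fetch being called several times in quick succession.
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			e.Start()
		}()
	}
	wg.Wait()

	e.Stop()
	e.Stop() // a second Stop is a no-op
	fmt.Println(e.watcher.starts, e.watcher.stops)
}
```

Holding the lock across both the check and the state change is what removes the window in which two callers could each see `watcherStarted == false`.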

@exekias exekias force-pushed the kubernetes-module-metadata branch from 96125fd to c282e67 Compare July 12, 2018 10:01
@exekias exekias force-pushed the kubernetes-module-metadata branch from c282e67 to d582e8b Compare July 12, 2018 10:04
Contributor Author

exekias commented Jul 13, 2018

Should be ready for another look

Labels
containers Related to containers use case enhancement Metricbeat Metricbeat review