Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for optional cluster type labels to cluster deployment and cluster provision metrics #1925

Merged
merged 8 commits into from
Feb 17, 2023

Conversation

suhanime
Copy link
Contributor

@suhanime suhanime commented Dec 2, 2022

In addition to the cluster-type label, it would be helpful for users to be able to provide their own set of labels (like STS, Private Link, Managed VPC etc) to look for while logging metrics. Allow support for such optional labels for metrics of ClusterDeployment and ClusterProvision controller

xref: https://issues.redhat.com/browse/HIVE-2045

/assign @2uasimojo

@openshift-ci openshift-ci bot requested review from 2uasimojo and dlom December 2, 2022 17:22
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 2, 2022
@suhanime
Copy link
Contributor Author

suhanime commented Dec 2, 2022

/retest

Copy link
Contributor Author

@suhanime suhanime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Configs to hiveConfig can be made as following:

    "spec": {
        "logLevel": "debug",
        "metricsConfig": {
            "metricsWithClusterTypeLabels": {
                "managed_vpc": "api.openshift.com/managed-vpc",
                "private_link": "api.openshift.com/private-link",
                "sts": "api.openshift.com/sts"
            }
        },
        "targetNamespace": "hive"
    },

Screenshot of example metric to ensure it works well with and without optional labels being provided:
image

@2uasimojo ready for review

pkg/controller/metrics/metrics.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics.go Outdated Show resolved Hide resolved
@suhanime suhanime changed the title Add more cluster type labels to cluster deployment and cluster provision metrics Add support for optional cluster type labels to cluster deployment and cluster provision metrics Dec 14, 2022
@codecov
Copy link

codecov bot commented Dec 14, 2022

Codecov Report

Merging #1925 (c34936b) into master (a04d71c) will increase coverage by 0.20%.
The diff coverage is 84.69%.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1925      +/-   ##
==========================================
+ Coverage   40.50%   40.71%   +0.20%     
==========================================
  Files         366      367       +1     
  Lines       34090    34239     +149     
==========================================
+ Hits        13809    13939     +130     
- Misses      19077    19091      +14     
- Partials     1204     1209       +5     
Impacted Files Coverage Δ
pkg/constants/constants.go 100.00% <ø> (ø)
...shift/hive/apis/hive/v1/clusterdeployment_types.go 33.33% <ø> (ø)
...er/clusterprovision/clusterprovision_controller.go 48.53% <20.00%> (-0.94%) ⬇️
.../clusterdeployment/clusterdeployment_controller.go 62.47% <44.44%> (-0.22%) ⬇️
pkg/controller/metrics/metrics.go 33.51% <62.50%> (ø)
.../controller/metrics/metrics_with_dynamic_labels.go 84.61% <84.61%> (ø)
...kg/controller/clusterdeployment/clusterinstalls.go 75.23% <100.00%> (-0.12%) ⬇️
.../controller/clusterdeployment/clusterprovisions.go 61.21% <100.00%> (-0.09%) ⬇️
pkg/controller/clusterdeployment/metrics.go 96.87% <100.00%> (+6.39%) ⬆️
pkg/controller/clusterprovision/metrics.go 100.00% <100.00%> (ø)
... and 2 more

@suhanime
Copy link
Contributor Author

/retest

@2uasimojo
Copy link
Member

/test e2e

Copy link
Member

@2uasimojo 2uasimojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't delve into the low-level details yet -- can do that once we've agreed on the more major things discussed inline.

apis/hive/v1/metricsconfig/metricsConfig.go Outdated Show resolved Hide resolved
pkg/controller/clusterprovision/metrics.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics.go Outdated Show resolved Hide resolved
Copy link
Contributor Author

@suhanime suhanime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@2uasimojo I think this addresses all of your comments. PTAL

pkg/controller/metrics/metrics.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics.go Outdated Show resolved Hide resolved
pkg/controller/clusterprovision/metrics.go Outdated Show resolved Hide resolved
apis/hive/v1/metricsconfig/metricsConfig.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics.go Outdated Show resolved Hide resolved
docs/hive-metrics.md Outdated Show resolved Hide resolved
pkg/controller/clusterdeployment/metrics.go Outdated Show resolved Hide resolved
Copy link
Member

@2uasimojo 2uasimojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review, will continue tomorrow.

apis/hive/v1/metricsconfig/metricsConfig.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metricsWithDynamicLabels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metricsWithDynamicLabels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metricsWithDynamicLabels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metricsWithDynamicLabels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metricsWithDynamicLabels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metricsWithDynamicLabels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metricsWithDynamicLabels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metricsWithDynamicLabels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metricsWithDynamicLabels.go Outdated Show resolved Hide resolved
Copy link
Member

@2uasimojo 2uasimojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another chunk of comments. Still not done.

pkg/controller/metrics/metricsWithDynamicLabels.go Outdated Show resolved Hide resolved
apis/hive/v1/metricsconfig/metricsConfig.go Outdated Show resolved Hide resolved
docs/hive-metrics.md Outdated Show resolved Hide resolved
docs/hive-metrics.md Outdated Show resolved Hide resolved
docs/hive-metrics.md Outdated Show resolved Hide resolved
docs/hive-metrics.md Outdated Show resolved Hide resolved
docs/hive-metrics.md Outdated Show resolved Hide resolved
docs/hive-metrics.md Outdated Show resolved Hide resolved
docs/hive-metrics.md Outdated Show resolved Hide resolved
docs/hive-metrics.md Outdated Show resolved Hide resolved
Copy link
Member

@2uasimojo 2uasimojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, finally done! Sorry it took so long!

pkg/controller/metrics/metricsWithDynamicLabels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metricsWithDynamicLabels.go Outdated Show resolved Hide resolved
Comment on lines 97 to 98
// Register the metrics to avoid panics during testing
registerMetrics(mapClusterTypeLabelToValue)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So actually, only TestClusterDeploymentReconcile (the first suite) panics if metrics aren't registered.

and

You can register them via this call at the top of that function and it works fine.

so

If you intend to add any kind of UT around the metrics configuration/labeling, you would want to be able to tailor the config accordingly in that test -- which you could then do by passing in a custom map (or, if taking the suggestion above, a custom MetricsConfig).

If you're not planning on adding that UT now, perhaps just beef up the comment here so future-we don't have to rediscover the above.

That said, I think you could -- and should -- add some unit tests for this PR. This would entail generating mocks for the following interfaces:

  • Observer (for HistogramVec.Observe())
  • ObserverVec (for HistogramVec.CurryWith())
  • Counter (for CounterVec.Inc())
  • Aww, shiiit, really? CounterVec.CurryWith() isn't associated with an interface??

Welp, I guess you can't UT the CounterVecs 🤷 (I even checked whether a more recent version of the prom golang client lib might have added that interface, but nope. This seemed such a silly omission that I opened an issue for it.)

I mean, I guess we could write our own interface in the hive project. We do this e.g. for the AWS client.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The metrics do not need to be registered, but they have to be defined at least - because code never checks whether a metric is registered, it always attempts to observe it, and prometheus libraries handle the rest.
I've changed the wording a bit, and I do intend to implement the unit testing in a follow-up PR

pkg/controller/clusterdeployment/metrics.go Outdated Show resolved Hide resolved
Copy link
Contributor Author

@suhanime suhanime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@2uasimojo Ready for review!

apis/hive/v1/metricsconfig/metricsConfig.go Outdated Show resolved Hide resolved
apis/hive/v1/metricsconfig/metricsConfig.go Outdated Show resolved Hide resolved
docs/hive-metrics.md Outdated Show resolved Hide resolved
docs/hive-metrics.md Outdated Show resolved Hide resolved
docs/hive-metrics.md Outdated Show resolved Hide resolved
docs/hive-metrics.md Outdated Show resolved Hide resolved
docs/hive-metrics.md Outdated Show resolved Hide resolved
Comment on lines 97 to 98
// Register the metrics to avoid panics during testing
registerMetrics(mapClusterTypeLabelToValue)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The metrics do not need to be registered, but they have to be defined at least - because code never checks whether a metric is registered, it always attempts to observe it, and prometheus libraries handle the rest.
I've changed the wording a bit, and I do intend to implement the unit testing in a follow-up PR

Copy link
Member

@2uasimojo 2uasimojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if you want to live chat about the threading concern.

apis/hive/v1/metricsconfig/metrics_config.go Outdated Show resolved Hide resolved
docs/hive_metrics.md Outdated Show resolved Hide resolved
docs/hive_metrics.md Outdated Show resolved Hide resolved
docs/hive_metrics.md Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/clusterdeployment/metrics.go Outdated Show resolved Hide resolved
@suhanime suhanime force-pushed the HIVE-2045 branch 2 times, most recently from d210af8 to ab05f82 Compare February 2, 2023 01:24
Copy link
Contributor Author

@suhanime suhanime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@2uasimojo Ready for review!

PS: I squashed some of the commits, but I did push my changes before the squash - for ease of review

pkg/controller/clusterdeployment/metrics.go Outdated Show resolved Hide resolved
apis/hive/v1/metricsconfig/metricsConfig.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metricsWithDynamicLabels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
@2uasimojo
Copy link
Member

/test e2e

...while I review this

Copy link
Member

@2uasimojo 2uasimojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very, very close.

I was gonna say we can land it as is and address my comments in a followon... but the pointer-to-the-map thing would actually affect the API, so we have to get it right out of the gate.

pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
apis/hive/v1/metricsconfig/metrics_config.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
Copy link
Contributor Author

@suhanime suhanime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@2uasimojo Ready for review!

apis/hive/v1/metricsconfig/metrics_config.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
}
}

// getLabelList combines fixed and optional labels and returns the list without duplicates.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While doing this, I thought about how it would be easier to have an early feedback - if a user tries to cause an overlap between these labels. However, there is not simpler way to do it, and we wouldn't want to error out because of metrics.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so a couple things:

  • I think we should do this check. We could do it during Register(), or we could do it from within New*VecWithDynamicLabels() itself.
  • ...and I think we should panic if there is an overlap (MustRegister() panics if something is wrong, so it fits the pattern).
  • Since it is going to be possible for a hive upgrade to trigger this panic (we add a fixed label that the user already had configured) we should make the message very clear: "HiveConfig.Spec.AdditionalClusterDeploymentLabels[%q] conflicts with a fixed label for metric %s. Please rename your label." kind of thing.
  • You may want to use the convenient Set class provided by apimachinery rather than rolling your own via map[string]bool.

@suhanime
Copy link
Contributor Author

suhanime commented Feb 9, 2023

/retest

Copy link
Member

@2uasimojo 2uasimojo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lovely iteration. I think we're really getting somewhere with the design of the objects/methods.

Most of my comments suggest refactoring, consolidation, simplification, cleanup. However, the couple of things still holding this from merging are:

  • We need to decide whether we ever allow the empty string as a label value. You have previously asserted that that somehow risks defining a metric with too few labels, and we agreed to always default.
  • I like the idea of panicking if the user tries to configure an optional label with the same key as a fixed one.

See inline for details.

Thanks!

}
}

// getLabelList combines fixed and optional labels and returns the list without duplicates.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so a couple things:

  • I think we should do this check. We could do it during Register(), or we could do it from within New*VecWithDynamicLabels() itself.
  • ...and I think we should panic if there is an overlap (MustRegister() panics if something is wrong, so it fits the pattern).
  • Since it is going to be possible for a hive upgrade to trigger this panic (we add a fixed label that the user already had configured) we should make the message very clear: "HiveConfig.Spec.AdditionalClusterDeploymentLabels[%q] conflicts with a fixed label for metric %s. Please rename your label." kind of thing.
  • You may want to use the convenient Set class provided by apimachinery rather than rolling your own via map[string]bool.

pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
Comment on lines 93 to 91
func NewCounterVecWithDynamicLabels(counterOpts *prometheus.CounterOpts,
labelOptions ...func(labels *dynamicLabels)) *CounterVecWithDynamicLabels {
counterVecMetric := &CounterVecWithDynamicLabels{
CounterOpts: counterOpts,
dynamicLabels: &dynamicLabels{},
}
for _, o := range labelOptions {
o(counterVecMetric.dynamicLabels)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This paradigm seems overcomplicated to me. Why not just

Suggested change
func NewCounterVecWithDynamicLabels(counterOpts *prometheus.CounterOpts,
labelOptions ...func(labels *dynamicLabels)) *CounterVecWithDynamicLabels {
counterVecMetric := &CounterVecWithDynamicLabels{
CounterOpts: counterOpts,
dynamicLabels: &dynamicLabels{},
}
for _, o := range labelOptions {
o(counterVecMetric.dynamicLabels)
}
func NewCounterVecWithDynamicLabels(counterOpts *prometheus.CounterOpts,
fixedLabels []string, optionalLabels map[string]string) *CounterVecWithDynamicLabels {
counterVecMetric := &CounterVecWithDynamicLabels{
CounterOpts: counterOpts,
dynamicLabels: &dynamicLabels{
fixedLabels: fixedLabels,
optionalLabels: optionalLabels,
},
}

Then we don't need the With*() functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused - didn't you want the Opts pattern here so we can allow for future expansion of this interface if needed? Or did I just completely misunderstand why Opts pattern was needed at all?
Nevertheless, I'm reverting to the original way which you suggest here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the original ask, you only asked for more of a CounterOpts and HistogramOpts, and when I was learning about the Opts pattern and how to implement it, I decided to allow for future expansion by implementing a more flexible option. Anyways, I'm over-explaining here. Implemented your change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I see. I hadn't been thinking about the options pattern. Now that I've looked at it in that light, I get the advantages. In particular, if we wanted to add fields to our fancy structure, we would have to add arguments to the constructor and then change all the calls to it. In this case, short term loss (more, and more complicated, code) but long term gain (easier to enhance in the future). If you want to put it back in a fup, I'm down. But for now this is good.

pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
pkg/controller/metrics/metrics_with_dynamic_labels.go Outdated Show resolved Hide resolved
This commit adds labels to indicate if cluster is of type STS,
Private-Link or Managed-VPC for ClusterProvision controller metrics
This commit adds labels to indicate if cluster is of type STS,
Private-Link or Managed-VPC for ClusterDeployment controller metrics.
These labels are added only to those metrics which already report
"cluster_type" label.
Allow users to add more cluster-type labels for metrics logging. These
labels should be applied to all ClusterDeployment and ClusterProvision
metrics that are logged per cluster-type. Optional labels can be
provided via metricsConfig section in the hiveConfig. Users can name the
label on the metric and assign it to a corresponding clusterDeployment
(or provision) label.

- While working on these changes, 2 metrics from ClusterProvision
  controller (metricClusterProvisionsTotal and metricInstallErrors) were
  moved to clusterProvision\metrics to collate all the relevant metrics
  and support was added for optional labels.
- Registering of metrics was moved later instead of init() time because
  the latter is executed before we could read the hiveConfig.
  Consequently the metrics needed to be explicitly registered for unit
  test purposes to avoid nil panics
- Create a new interface for metrics with dynamic/optional labels and
  implement observing methods to simplify declaring and logging of the
metric
- Make cluster_type label a part of optional label logic, so it can
  eventually be made opt-in via hiveconfig
- Rename MetricsWithClusterTypeLabels to AdditionalClusterDeploymentLabels
- Always add optional labels (while currying) in sorted order
- Remove DefaultClusterType constant from clusterdeployment_types
  because it was only used at one place that was removed as a part of
this commit
- Add documentation for metrics

- Rename files and package names as per Golang naming conventions
- Add Hive Operator metrics in documentation

- Add omitempty to metricsConfig fields
- Set cluster_type only if config is empty
- Ensure metric writes/observes are protected from race conditions
This commit implements some of the changes recommended based on PR
comments. Main change is applying Opts pattern to the interface for
metrics with dynamic labels and apply a buildLabels helper to ensure
there is no overlap between fixed and optional labels as we define and
observe the corresponding metrics.

A side effect to the buildLabels helper is that we no longer need to
build a curried vector for a metric before observing it - we instead
build the map for labels to value and observe them via `With` (or
`GetMetricWith`) method, which is similar to how we observe metrics
without optional label support.
Copy link
Contributor Author

@suhanime suhanime left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@2uasimojo Ready for review

Comment on lines 93 to 91
func NewCounterVecWithDynamicLabels(counterOpts *prometheus.CounterOpts,
labelOptions ...func(labels *dynamicLabels)) *CounterVecWithDynamicLabels {
counterVecMetric := &CounterVecWithDynamicLabels{
CounterOpts: counterOpts,
dynamicLabels: &dynamicLabels{},
}
for _, o := range labelOptions {
o(counterVecMetric.dynamicLabels)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused - didn't you want the Opts pattern here so we can allow for future expansion of this interface if needed? Or did I just completely misunderstand why Opts pattern was needed at all?
Nevertheless, I'm reverting to the original way which you suggest here.

Comment on lines 93 to 91
func NewCounterVecWithDynamicLabels(counterOpts *prometheus.CounterOpts,
labelOptions ...func(labels *dynamicLabels)) *CounterVecWithDynamicLabels {
counterVecMetric := &CounterVecWithDynamicLabels{
CounterOpts: counterOpts,
dynamicLabels: &dynamicLabels{},
}
for _, o := range labelOptions {
o(counterVecMetric.dynamicLabels)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the original ask, you only asked for more of a CounterOpts and HistogramOpts, and when I was learning about the Opts pattern and how to implement it, I decided to allow for future expansion by implementing a more flexible option. Anyways, I'm over-explaining here. Implemented your change.

if typeStr, ok := obj.GetLabels()[hivev1.HiveClusterTypeLabel]; ok {
// GetLabelValue returns the value of the label if set, otherwise a default value.
func GetLabelValue(obj metav1.Object, label string) string {
if typeStr, ok := obj.GetLabels()[label]; ok {
Copy link
Contributor Author

@suhanime suhanime Feb 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so I do not remember when I discovered a problem - I distinctly remember us debating what a default value should be (like N/A or unknown). Nevertheless, I decided to track down the actual problem in the library because just 1 person remembering something like this is not ideal for future-proofing.
Okay, so With or WithLabelValues internally calls GetMetricWith from vec.go, and the problem comes when getOrCreateMetricWithLabels creates the map with size equivalent to labels passed instead of reading from metric.Desc to determine the correct size (Note - even the extractLabelValues only relies on labels provided)

TL;DR
When the metric is created, it relies on the labels provided instead of metric definition as the dimension for creating the map. So if you provide empty string as label, it ignores it and creates the resultant metric map with less dimensions, consequently leading to always dropping some labels while observing.
Maybe With might not lead to such a problem - WithLabelValues definitely does. But if we know that happens, we don't want to treat them differently as a safety.

PS: These are the closest relates issues/discussions I found related to this topic:
variable labels
missing labels
flexible metric exposition

if typeStr, ok := obj.GetLabels()[hivev1.HiveClusterTypeLabel]; ok {
// GetLabelValue returns the value of the label if set, otherwise a default value.
func GetLabelValue(obj metav1.Object, label string) string {
if typeStr, ok := obj.GetLabels()[label]; ok {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay; so then my comment still applies. If the label exists but happens to have an empty value, this code as written will cause it to be observed with an empty value rather than our default.

Now, based on the code you pointed to, it seems to me like that would be okay, because we would actually be passing in the empty string as an argument, as opposed to omitting the argument entirely, so the number of dimensions would still be correct. And if that's true, I would argue it's the appropriate thing to do: i.e. there's a difference in behavior for the user between the CR lacking the label ("unspecified") and the label existing with an empty value (the value is specified, and it's "").

Whichever way we decide to do it, it should be consistent. Right now the fixed labels always convert "" to the default, but the dynamic ones don't.

},
}
if repeatedLabel := counterVecMetric.getRepeatedLabel(); repeatedLabel != "" {
log.Errorf("HiveConfig.Spec.AdditionalClusterDeploymentLabels[%q] conflicts with a fixed label for the metric %s. Please rename your label.", repeatedLabel, counterOpts.Name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We agreed to make this fatal, right?. (And if you turn this into a panic, you don't need the log arg.)

return finalList
}

// getRepeatedLabel detects and returns the first optional label it encounters, that overlaps with another label
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This would be more useful if it returned all repeated labels so the consumer doesn't have to iterate. Can be done in a fup though.

@2uasimojo
Copy link
Member

Beautiful!

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 17, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 17, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 2uasimojo, suhanime

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD a04d71c and 2 for PR HEAD c34936b in total

@2uasimojo
Copy link
Member

/retest

1 similar comment
@suhanime
Copy link
Contributor Author

/retest

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Feb 17, 2023

@suhanime: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit e2fccdf into openshift:master Feb 17, 2023
@suhanime suhanime deleted the HIVE-2045 branch February 17, 2023 20:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants