Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#342 - Implemented Label Value Sanitizer #776

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

Tigraine
Copy link

@brian-brazil as discussed (a few years ago 🤣 ) here #342

This PR adds the ability to all SimpleCollector instances to manipulate labels before they are processed.
This allows users to extend a label to be runtime safe or do manipulation like removing credentials etc..

Example would be:

String clientName = getClientName(); // this label is not null-safe
counter.label(clientName).inc() <-- will throw IllegalArgumentException only during runtime

Usage:

metric = Gauge.build()
            .name("nonulllabels")
            .help("help")
            .labelNames("l")
            .labelValueSanitizer(Gauge.TransformNullLabelsToEmptyString())
            .register(registry);

Instead of having to sanitise all labels before calling the instrumentation code we can make the code safe at setup and not care about downstream mis-use.
Also since this is implemented through an interface additional use cases can be passed by consumers of the library.

Copy link
Member

@fstab fstab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the PR. I like the idea of having a generic way to transform labels. I added a few thoughts as review comments. Let me know what you think.

@@ -0,0 +1,5 @@
package io.prometheus.client;

public interface LabelValueSanitizer {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you say this can also be used for other use cases, not just "sanitization". So maybe the interface should have a more generic name covering all use cases, like LabelValueTransformer?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @fstab ... LabelValueTransformer seems like it could be used more generically.

Copy link
Collaborator

@dhoard dhoard Jul 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may be future use cases/places where the paradigm could be used to transform other (non-label) String values, so I feel it would make more sense to rename the interface to StringTransformer.

package io.prometheus.client;

public interface LabelValueSanitizer {
String[] sanitize(String... labelValue);
Copy link
Member

@fstab fstab May 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to sanitize only a single label value, not the whole array. For example, you could register a sanitizer like this:

metric = Gauge.build()
        .name("nonulllabels")
        .help("help")
        .labelNames("path", "status")
        .labelValueSanitizer("path", myPathSanitizer)
        .register(registry);

and then the myPathSanitizer would only be called for the path label value, but not for an array of all label values.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since labelNames takes a String array, I think it makes sense for the class to manipulate the labels also takes a String array.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depends on what use case you have in mind.

What I have in mind is that you have a specific label that you want to transform (for example stripping a unique user ID from an HTTP path). For that, it would be good to implement a specific pathTransformer and apply it to the path label. Advantage is: You don't need to magically know at which position in the array the path label is. And if path is passed as a label to multiple different metrics, you can reuse the transformer, even if the path is on different positions for each metric.

The array as parameter only makes sense if you want to do exactly the same transformation for all labels. This would work for replacing null with "null", but for every other use case I think it's better if you know which label you want to transform.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both might actually be interesting.
Maybe having a Transformer be applied to a specific label or * to apply to all?
I could for example think of a case where you'd want to have your infrastructure automatically prevent users from using excessively long labels (truncating them automatically).

Comment on lines +198 to +214
/**
* Transforms null labels to an empty string to guard against IllegalArgument runtime exceptions
*/
static LabelValueSanitizer TransformNullLabelsToEmptyString() {
return new LabelValueSanitizer() {
@Override
public String[] sanitize(String... labelValue) {
for (int i = 0; i < labelValue.length; i++) {
if (labelValue[i] == null) {
labelValue[i] = "";
}
}
return labelValue;
}
};
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the sanitizer takes only a single string as argument instead of an array of strings, a good handler for null values would be String::valueOf. As this already exists in the Java runtime, there's no need to provide our own implementation in SimpleCollector.

@fstab
Copy link
Member

fstab commented May 2, 2022

Regarding the specific case of null values: If we revisit this and not throw an IllegalArgumentException, what do you think would be a good default value? The literal string "null" as returned by String::valueOf, or an empty value?

@Tigraine
Copy link
Author

Tigraine commented Jun 2, 2022

Regarding the specific case of null values: If we revisit this and not throw an IllegalArgumentException, what do you think would be a good default value? The literal string "null" as returned by String::valueOf, or an empty value?

Thanks for the feedback! I will update the PR to move to a Transformer as it makes a lot of sense.

I was thinking about the appropriate default value a bit and in our case 'NULL' would be a perfectly good default value - but that might not be for everyone as a default which is the main reason I came up with this PR and the concept of Sanitization.

static LabelValueSanitizer NoOpLabelValueSanitizer() {
return new LabelValueSanitizer() {
@Override
public String[] sanitize(String... labelValue) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change labelValue to labelValues since it's an array.

static LabelValueSanitizer TransformNullLabelsToEmptyString() {
return new LabelValueSanitizer() {
@Override
public String[] sanitize(String... labelValue) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change labelValue to labelValues since it's an array.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants