-
Notifications
You must be signed in to change notification settings - Fork 797
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#342 - Implemented Label Value Sanitizer #776
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Daniel Hoelbling-Inzko <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the PR. I like the idea of having a generic way to transform labels. I added a few thoughts as review comments. Let me know what you think.
@@ -0,0 +1,5 @@ | |||
package io.prometheus.client; | |||
|
|||
public interface LabelValueSanitizer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you say this can also be used for other use cases, not just "sanitization". So maybe the interface should have a more generic name covering all use cases, like LabelValueTransformer
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @fstab ... LabelValueTransformer
seems like it could be used more generically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may be future use cases/places where the paradigm could be used to transform other (non-label) String
values, so I feel it would make more sense to rename the interface to StringTransformer
.
package io.prometheus.client; | ||
|
||
public interface LabelValueSanitizer { | ||
String[] sanitize(String... labelValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to sanitize only a single label value, not the whole array. For example, you could register a sanitizer like this:
metric = Gauge.build()
.name("nonulllabels")
.help("help")
.labelNames("path", "status")
.labelValueSanitizer("path", myPathSanitizer)
.register(registry);
and then the myPathSanitizer
would only be called for the path label value, but not for an array of all label values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since labelNames
takes a String
array, I think it makes sense for the class to manipulate the labels also takes a String
array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on what use case you have in mind.
What I have in mind is that you have a specific label that you want to transform (for example stripping a unique user ID from an HTTP path). For that, it would be good to implement a specific pathTransformer
and apply it to the path
label. Advantage is: You don't need to magically know at which position in the array the path
label is. And if path
is passed as a label to multiple different metrics, you can reuse the transformer, even if the path
is on different positions for each metric.
The array as parameter only makes sense if you want to do exactly the same transformation for all labels. This would work for replacing null
with "null"
, but for every other use case I think it's better if you know which label you want to transform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both might actually be interesting.
Maybe having a Transformer be applied to a specific label or *
to apply to all?
I could for example think of a case where you'd want to have your infrastructure automatically prevent users from using excessively long labels (truncating them automatically).
/** | ||
* Transforms null labels to an empty string to guard against IllegalArgument runtime exceptions | ||
*/ | ||
static LabelValueSanitizer TransformNullLabelsToEmptyString() { | ||
return new LabelValueSanitizer() { | ||
@Override | ||
public String[] sanitize(String... labelValue) { | ||
for (int i = 0; i < labelValue.length; i++) { | ||
if (labelValue[i] == null) { | ||
labelValue[i] = ""; | ||
} | ||
} | ||
return labelValue; | ||
} | ||
}; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the sanitizer takes only a single string as argument instead of an array of strings, a good handler for null values would be String::valueOf
. As this already exists in the Java runtime, there's no need to provide our own implementation in SimpleCollector
.
Regarding the specific case of |
Thanks for the feedback! I will update the PR to move to a Transformer as it makes a lot of sense. I was thinking about the appropriate default value a bit and in our case 'NULL' would be a perfectly good default value - but that might not be for everyone as a default which is the main reason I came up with this PR and the concept of Sanitization. |
static LabelValueSanitizer NoOpLabelValueSanitizer() { | ||
return new LabelValueSanitizer() { | ||
@Override | ||
public String[] sanitize(String... labelValue) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change labelValue
to labelValues
since it's an array.
static LabelValueSanitizer TransformNullLabelsToEmptyString() { | ||
return new LabelValueSanitizer() { | ||
@Override | ||
public String[] sanitize(String... labelValue) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change labelValue
to labelValues
since it's an array.
@brian-brazil as discussed (a few years ago 🤣 ) here #342
This PR adds the ability to all SimpleCollector instances to manipulate labels before they are processed.
This allows users to extend a label to be runtime safe or do manipulation like removing credentials etc..
Example would be:
Usage:
Instead of having to sanitise all labels before calling the instrumentation code we can make the code safe at setup and not care about downstream mis-use.
Also since this is implemented through an interface additional use cases can be passed by consumers of the library.