Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WX-1410] Sanitize 4 byte UTF-8 characters before inserting into METADATA_ENTRY #7414
[WX-1410] Sanitize 4 byte UTF-8 characters before inserting into METADATA_ENTRY #7414
Changes from 3 commits
efe8ca0
5235ace
9a9521f
e591bca
afc8865
68567c3
7190a69
02f41e2
14ceb40
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I added that in a previous iteration -- its been removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great comment!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small recommendation for performance reasons:
metadataKeysToClean
list every timeprocess
is called by assigning it to alazy val
instead of aval
.lazy val
.ConfigFactory.load()
function more than once.https://leobenkel.com/2020/07/skb-scala-val-lazy-def/#:~:text=The%20keyword%20lazy%20allows%20a,as%20the%20value%20is%20declared.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is resolved by passing in the
metadataKeysToClean
at actor creation time, see Janet's comment belowThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be reading this wrong or misunderstanding how configs work - but to me it looks like the config value in
reference.conf
is aList[String]
and not anOption[List[String]]
. Would that be a more appropriate type to pass toas
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A reason to use
Option[List[String]]
is to not crash if the config item is entirely missing. More or less important depending on whether we include the key with a default value inreference.conf
, but good defensive programming either way.Check warning on line 57 in services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala
Codecov / codecov/patch
services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala#L57
Check warning on line 70 in services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala
Codecov / codecov/patch
services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala#L70
Check warning on line 72 in services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala
Codecov / codecov/patch
services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala#L72
Check warning on line 76 in services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala
Codecov / codecov/patch
services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala#L74-L76
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we're doing something a little more complicated than matching on the keys in the config - we're also matching on keys prefixed with them? Was that necessary for the usage we've seen?
It might be easier to think about if we did something like this, which will match the full key or a colon-delimited prefix of one or more items:
Either way, we should update the comment in
cromwell.examples.conf
to reflect the prefix behavior.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, this was from an old iteration of code, we only need to sanitize on exact matches. This has been fixed in the new PR.
Check warning on line 83 in services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala
Codecov / codecov/patch
services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala#L78-L83
Check warning on line 85 in services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala
Codecov / codecov/patch
services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala#L85
Check warning on line 88 in services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala
Codecov / codecov/patch
services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala#L88
Check warning on line 93 in services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala
Codecov / codecov/patch
services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala#L93
Check warning on line 95 in services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala
Codecov / codecov/patch
services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala#L95
Check warning on line 98 in services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala
Codecov / codecov/patch
services/src/main/scala/cromwell/services/metadata/impl/WriteMetadataActor.scala#L97-L98
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use the config to include this metadata in what we're sanitizing rather than cleaning it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Followup: in theory we could, but we would need to check every single
executionEvent
, of which there are many, because their keys are all like:Leaving this comment in case anyone else has this idea, but I'm OK leaving this as-is for now. Would be great to add a comment explaining this context, though (we sanitize other metadata values elsewhere, but are handing this one differently).
We'll also need to keep in mind that we may need to do something different for Batch. I'll create a ticket in that epic.
This file was deleted.