-
Notifications
You must be signed in to change notification settings - Fork 360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WX-1629] Add metadata value type tests #7433
Conversation
@@ -13,5 +13,5 @@ metadata { | |||
status: Succeeded | |||
"outputs.draft3_infer_version.j": 660 | |||
"actualWorkflowLanguage": WDL | |||
"actualWorkflowLanguageVersion": 1.0 | |||
"actualWorkflowLanguageVersion": "1.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice one 😂
...r/src/main/resources/standardTestCases/metadata_type_validation/metadata_type_validation.wdl
Outdated
Show resolved
Hide resolved
...rdTestCases/wdl_biscayne/biscayne_new_runtime_attributes/biscayne_new_runtime_attributes.wdl
Outdated
Show resolved
Hide resolved
...rc/main/resources/standardTestCases/wdl_draft3/draft3_infer_version/draft3_infer_version.wdl
Outdated
Show resolved
Hide resolved
case JsNull => (expected != JsNull).option(s"expected: $cacheSubstitutions but got: $actual") | ||
case _ => Option(s"expected: $cacheSubstitutions but got: $actual") | ||
} | ||
|
||
matchError.map(s"Metadata mismatch for $key - " + _) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've done an admirable job creating your own string-to-array parser here, but I wonder if there's anything available in spray that can do this for us. Nice to outsource JSON serialization and deserialization wherever we can.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, spray has a parseJson
method which was able to do this for us!
case Result.Failure(_) => invalidNel(s"Metadata block can not be converted to a Map: $config") | ||
def fromConfig(config: Config): ErrorOr[WorkflowFlatMetadata] = { | ||
val values: scala.collection.mutable.Map[String, JsValue] = scala.collection.mutable.Map[String, JsValue]() | ||
config.entrySet().stream().forEach { value => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above - any chance spray can help us with this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think spray can help us with this one, by this point we aren't dealing with JSON anymore. I looked around for a function that could do this for us and it doesn't seem like any such function exists.
@@ -29,16 +29,7 @@ class WriteMetadataActor(override val batchSize: Int, | |||
private val statsRecorder = MetadataStatisticsRecorder(metadataStatisticsRecorderSettings) | |||
|
|||
override def process(e: NonEmptyVector[MetadataWriteAction]) = instrumentedProcess { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just rearrangement or are there logic changes here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just rearrangement to make the process method more easily unit testable.
...rc/main/resources/standardTestCases/wdl_draft3/draft3_infer_version/draft3_infer_version.wdl
Outdated
Show resolved
Hide resolved
@@ -14,9 +14,9 @@ metadata { | |||
"outputs.biscayne_read_functions_windows_line_endings.map.y": 600 | |||
"outputs.biscayne_read_functions_windows_line_endings.map.z": 700 | |||
"outputs.biscayne_read_functions_windows_line_endings.map.x": 500 | |||
"outputs.biscayne_read_functions_windows_line_endings.tsv.0": "[\"line1\",\"line one\"]" | |||
"outputs.biscayne_read_functions_windows_line_endings.tsv.1": "[\"line2\",\"line two\"]" | |||
"outputs.biscayne_read_functions_windows_line_endings.tsv.2": "[\"line3\",\"line three\"]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is super nice, I've definitely been annoyed in the past about having to craft just the right escape sequence for a test.
} else { | ||
IO.unit | ||
} | ||
} | ||
|
||
private def checkIfActuallySame(firstSet: Set[(String, JsValue)], secondSet: Set[(String, JsValue)]): Boolean = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming: this function alludes to a "kinda same"/"actually same" duality... not sure that's what's happening
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Essentially the actual outputs are in a "cleaned" state where extraneous quotation marks have been removed, but the expected outputs might have extra quotation marks if the output is a String (e.g., the actual output will be "hello" but the expected output will read ""hello""). This function just cleans the outputs to make sure that we're not reporting a false difference caused by quotes that should have been removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment to that affect?
HOCON). However values coming from the metadata endpoint are JsValue. At the moment we're only using | ||
JsString, JsNumber and JsBoolean for comparison so this is a hacky way of handling that situation. It's | ||
entirely likely that it won't survive long term. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@geoffjentry we determined that 8 years is the going definition of "long term".
# Due to Centaur changes in WX-1629, lots_of_inputs_papiv2 times out due to the large number of inputs. Instead, | ||
# we run lots_of_inputs, which tests 400 inputs instead of the 10,000 in lots_of_inputs_papiv2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from this test – did the slower Centaur evaluation inflate the overall suite time? I can't immediately intuit whether it would matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comparing the length of time of the tests on this PR to the nightly integration tests, the timings are about the same (some suites are a little faster, some are a little slower, all are pretty close though). I think that one test was problematic because it creates an array of 10,000 Files, which causes a recursion stack of 10,000 calls when we process the array. No other tests create such large arrays, and I think that call stack might have caused memory issues and slowness (I didn't test this hypothesis, but it would make sense since that seems to be the main difference between this test and others).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 sounds fine to me, since I only care about performance to the extent that I have to wait for tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice job wrangling this!
} else { | ||
IO.unit | ||
} | ||
} | ||
|
||
private def checkIfActuallySame(firstSet: Set[(String, JsValue)], secondSet: Set[(String, JsValue)]): Boolean = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment to that affect?
centaur/src/main/scala/centaur/test/metadata/WorkflowFlatMetadata.scala
Outdated
Show resolved
Hide resolved
|
||
services { | ||
MetadataService { | ||
metadata-keys-to-sanitize-utf8mb4 = ["submittedFiles:workflow", "commandLine", "outputs:metadata_type_validation.validate_boolean.boolean_output"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was special about outputs:metadata_type_validation.validate_boolean.boolean_output
that caused it to be added here? Might want a comment explaining.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's nothing special about that key, I added it to prove through integration testing that sanitizing metadata keys won't convert non-Strings to Strings (essentially I just picked that one randomly).
...rc/main/resources/standardTestCases/wdl_draft3/draft3_infer_version/draft3_infer_version.wdl
Outdated
Show resolved
Hide resolved
# timeout-minutes: 30 | ||
# with: | ||
# limit-access-to-actor: true | ||
# detached: true | ||
#This script bascially just looks up another script to run, assuming that the other script's filename is: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is left in here commented-out intentionally and I support it 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is correct!
Follow-up to WX-1410 to add additional testing of metadata value types to ensure that values are inserted into the database with correct types.