Moving the WDL for importing array manifest to BQ #6860
Conversation
just one change needed
# checking for != "build37Flag" skips the header row (we don't want that numbered)
# only process rows with 29 fields - this skips some header info fields
# also skip entries that are flagged, not matched or have index conflict
awk -F ',' 'NF==29 && ($29!="ILLUMINA_FLAGGED" && $29!="INDEL_NOT_MATCHED" && $29!="INDEL_CONFLICT" && $29!="build37Flag") { flag=$29; if ($29=="PASS") flag="null"; print id++","$2","$9","$23","$24","$25","$26","$27","flag }' $TMP_SORTED > $TMP_PROC
id++ should be changed to ++id
This was in a recent PR of mine, but the rest looks correct.
Eventually we should add a test that verifies (1) there is no probe 0 and (2) that two different runs of this code yield the same probe->id map (both of these were bugs in the past; don't feel like you have to do that for this PR).
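For illustration, a minimal demo of the off-by-one being pointed out here (plain awk; the input values are made up):

# Post-increment emits a probe 0; pre-increment starts the ids at 1.
printf 'a\nb\nc\n' | awk '{ print id++ "," $1 }'   # -> 0,a  1,b  2,c
printf 'a\nb\nc\n' | awk '{ print ++id "," $1 }'   # -> 1,a  2,b  3,c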
Definitely considering it a TODO to add tests that actually make sure the imported data is correct. I haven't figured out the best way to do this yet, though. Maybe I need a GATK tool that pulls down a table and compares it to an expected file? Or maybe looking at the extracted VCF from the full "end-to-end" test will give us enough coverage? I'm not sure that part of the pipeline pulls information from the manifest.
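One possible shape for that check, sketched with the bq CLI (the dataset/table name and the ORDER BY column below are placeholders, not necessarily what the WDL actually creates):

# Dump the ingested table in a deterministic order and diff it against
# a checked-in expected file. Table name and sort key are assumptions.
bq query --use_legacy_sql=false --format=csv \
  'SELECT * FROM `broad-dsde-dev.temp_tables.probe_info` ORDER BY probe_id' \
  > actual_probe_info.csv
diff expected_probe_info.csv actual_probe_info.csv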
{
  "ImportArrayManifest.extended_manifest_csv": "/home/travis/build/broadinstitute/gatk/src/test/resources/org/broadinstitute/hellbender/tools/variantdb/arrays/tiny_manifest.csv",
  "ImportArrayManifest.manifest_schema_json": "/home/travis/build/broadinstitute/gatk/scripts/variantstore_wdl/schemas/manifest_schema.json",
  "ImportArrayManifest.project_id": "broad-dsde-dev",
Is this just copied from elsewhere? Seems like there should be a gatk-test Google project...
This is the GATK test project AFAIK. It's used in the BigQueryUtils tests (in GATK).
String? docker
}

String docker_final = select_first([docker, "us.gcr.io/broad-gatk/gatk:4.1.7.0"])
Are you overriding this? Or are we always testing with this static GATK version?
I'm not overriding it in this case because we're not using any GATK tools in this WDL. But maybe we should still be testing the current branch regardless? I know we'll need to for future WDLs (ingest, calculate metrics, and extract will all use GATK tools that will need to be in a docker built from the current branch).
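If we do want the test to track the branch, one hedged sketch of wiring that up (jq, the image name, the BRANCH_TAG variable, and the inputs file name are all assumptions, not part of this PR):

# Patch the test inputs so docker_final's select_first picks up a
# branch-built image instead of the static 4.1.7.0 default.
jq --arg img "us.gcr.io/broad-dsde-methods/gatk:${BRANCH_TAG}" \
  '. + {"ImportArrayManifest.docker": $img}' \
  test_inputs.json > test_inputs.branch.json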
* Copying wdl from variantstore repo
* Adding tests and changes to WDL
* addressing comments
* adding readme
I moved the WDL for importing the array manifest over from the variantstore repo and added a test. The test here only checks that the WDL succeeded; it doesn't look at the results (yet). It ingests the manifest into a dataset with a 7-day TTL, so the tables eventually get cleaned up. That might be too long for this case, since a table is added each time the test is run (so on every push and PR).
I plan to add more of the "end-to-end" pipeline with more testing in the future using a similar scheme, so feedback on the structure is welcome.