Callset statistics [VS-560] #8018
Conversation
Codecov Report
Additional details and impacted files

@@            Coverage Diff             @@
##           ah_var_store    #8018   +/- ##
=============================================
  Coverage              ?   86.226%
  Complexity            ?     35201
=============================================
  Files                 ?      2173
  Lines                 ?    165004
  Branches              ?     17792
=============================================
  Hits                  ?    142277
  Misses                ?     16393
  Partials              ?      6334
Where do the outputs go? I don't see anything in the ~{extract_prefix}_statistics table that got created in my test run.
Also, I kind of think a text file as output would be very useful for analysis/reporting.
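One possible shape for that (a sketch only — the dataset_name input and the statistics_csv output name below are placeholders, not necessarily what the workflow defines) would be to redirect the CSV-formatted query output to a local file and declare it as a task output:

    command <<<
        set -o errexit -o nounset -o xtrace -o pipefail

        # Dump the statistics table to a local CSV so it can be delivered alongside the callset.
        # Note: bq query caps returned rows unless --max_rows is raised above its default of 100.
        bq query --nouse_legacy_sql --project_id=~{project_id} --format=csv \
            'SELECT * FROM `~{dataset_name}.~{extract_prefix}_statistics`' > ~{extract_prefix}_statistics.csv
    >>>

    output {
        File statistics_csv = "~{extract_prefix}_statistics.csv"
    }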
    exit 1
fi

# Schemas extracted programmatically: https://stackoverflow.com/a/66987934
cool!
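(For context — and not necessarily the approach in the linked answer or in this WDL — one way to pull a table schema programmatically with the bq CLI is a one-liner like the following; the dataset and table names are placeholders:)

    # Print the table's schema as JSON for downstream tooling.
    bq show --schema --format=prettyjson ~{project_id}:some_dataset.some_table > some_table_schema.json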
    singleton,
    pass_qc
)
SELECT "~{filter_set_name}" filter_set_name,
I know that none of the explanations are in the code you're looking at to write this WDL, but I think getting Lee to add some context about what is being calculated would be really helpful. I'm fine with that being a future ticket.
Force-pushed from 9a22b7a to 99c60af
Reran it here and it succeeded; the data looks good as far as I can tell.
As far as I can tell, this workflow creates the statistics table but does not output the contents into a TSV or CSV, which is what we deliver along with the callset. Would it be possible to add an export to TSV to a specified GCS location to the CollectStatistics task (or a new one)?
On the good news front, I compared an export of the "statistics_table" to the callset stats file I generated for Beta and they matched! 👍🏻 (if you're curious, the run is https://app.terra.bio/#workspaces/allofus-drc-wgs-dev/AoU_DRC_WGS_12-6-21_beta_ingest/job_history/45a7764c-9f8f-49e3-b1f6-2bf28ac16b4b)
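A sketch of what that could look like with bq extract (the GCS path and dataset_name below are placeholders, not actual workflow inputs):

    # Export the statistics table to a tab-delimited file at a caller-specified GCS location.
    bq extract --project_id=~{project_id} \
        --destination_format=CSV --field_delimiter=tab \
        ~{dataset_name}.~{extract_prefix}_statistics \
        gs://some-delivery-bucket/~{extract_prefix}_statistics.tsv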
Force-pushed from 6052b1d to d8b7470
Now with export to CSV.
command <<<
    set -o errexit -o nounset -o xtrace -o pipefail

    bq query --nouse_legacy_sql --project_id=~{project_id} --format=csv '
You probably need to include a --max_rows with the number of samples; otherwise the file will be limited to 100 rows (see https://stackoverflow.com/questions/34215311/how-bq-query-can-get-10000-rows).
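Something along these lines, for example (num_samples is a hypothetical input carrying the callset's sample count; the SELECT is elided):

    # Raise the row cap above the 100-row default so every sample's row lands in the CSV.
    bq query --nouse_legacy_sql --project_id=~{project_id} --format=csv \
        --max_rows=~{num_samples} \
        'SELECT ...' > ~{extract_prefix}_statistics.csv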
Force-pushed from 513244f to 0c53018
Successful Quickstart run here; it has not yet been run on larger datasets.