Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove pet code from CreateVariantIngestFiles and friends [VS-375] #7773

Merged
merged 12 commits into from
Apr 15, 2022

Conversation

mcovarr
Copy link
Collaborator

@mcovarr mcovarr commented Apr 11, 2022

No description provided.

@gatk-bot
Copy link

gatk-bot commented Apr 11, 2022

Travis reported job failures from build 38745
Failures in the following jobs:

Test Type JDK Job ID Logs
integration openjdk11 38745.12 logs
integration openjdk8 38745.2 logs

@mcovarr mcovarr changed the title wip Remove pet code from CreateVariantIngestFiles [VS-375] Apr 11, 2022
@mcovarr mcovarr changed the title Remove pet code from CreateVariantIngestFiles [VS-375] Remove pet code from CreateVariantIngestFiles and friends [VS-375] Apr 12, 2022
@mcovarr mcovarr marked this pull request as ready for review April 12, 2022 22:03
@mcovarr mcovarr force-pushed the mc_createvariantingestfiles_pet_cruft branch from 957d1ba to f157925 Compare April 13, 2022 17:43
@@ -123,7 +123,7 @@ task CheckForDuplicateData {
# check the INFORMATION_SCHEMA.PARTITIONS table to see if any of input sample names/ids have data loaded into their partitions
# this returns the list of sample names that do already have data loaded
echo "WITH items as (SELECT s.sample_id, s.sample_name, s.is_loaded, s.withdrawn FROM \`${TEMP_TABLE}\` t left outer join \`${SAMPLE_INFO_TABLE}\` s on (s.sample_name = t.sample_name)) " >> query.sql
echo "SELECT i.sample_name FROM \`${INFO_SCHEMA_TABLE}\` p JOIN items i ON (p.partition_id = CAST(i.sample_id AS STRING)) WHERE p.total_logical_bytes > 0 AND (table_name like 'ref_ranges_%' OR table_name like 'vet_%' OR table_name like 'pet_%')" >> query.sql
echo "SELECT i.sample_name FROM \`${INFO_SCHEMA_TABLE}\` p JOIN items i ON (p.partition_id = CAST(i.sample_id AS STRING)) WHERE p.total_logical_bytes > 0 AND (table_name like 'ref_ranges_%' OR table_name like 'vet_%')" >> query.sql
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

outside the scope of this pr---but this might be a good place to collect info on which samples have been loaded into just the vet or just re_ranges

if (enablePet || enableReferenceRanges) {
refCreator = new RefCreator(sampleIdentifierForOutputFileName, sampleId, tableNumber, seqDictionary, gqStateToIgnore, dropAboveGqThreshold, outputDir, outputType, enablePet, enableReferenceRanges, projectID, datasetName);
if (enableReferenceRanges) {
//noinspection ConstantConditions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to complain, but we do this for the VET as well.!??!...seems like unneeded complexity esp since it's also a param

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷

@@ -279,12 +258,10 @@ public void apply(final VariantContext variant, final ReadsContext readsContext,
refCreator.apply(variant, intervalsToWrite);
}
} catch (IOException ioe) {
throw new GATKException("Error writing PET", ioe);
throw new GATKException("Error writing reference ranges", ioe);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here, are we then assuming enableReferenceRanges is true?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, it looks like there should be an if here like the one for vet on line 250.

@@ -35,8 +32,8 @@
* Ingest variant walker
*/
@CommandLineProgramProperties(
summary = "Exome and Genome Ingest tool for the Joint Genotyping in Big Query project",
oneLineSummary = "Ingest tool for BQJG",
summary = "Exome and Genome Ingest tool for the Genomic Variant Store in Big Query project",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we could drop the "in Big Query project"

shortName = "ep",
doc = "write pet data",
optional = true)
public boolean enablePet = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we from the default as true here, maybe we make it default to true on ref ranges above?

@mcovarr mcovarr requested a review from RoriCremer April 13, 2022 19:11
@mcovarr mcovarr force-pushed the mc_createvariantingestfiles_pet_cruft branch from 351e9be to 6843ef4 Compare April 14, 2022 14:43
@mcovarr mcovarr merged commit 7a15427 into ah_var_store Apr 15, 2022
@mcovarr mcovarr deleted the mc_createvariantingestfiles_pet_cruft branch April 15, 2022 17:45
@lbergelson lbergelson mentioned this pull request Mar 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants