add key for genomic reference to template #352
Comments
We do actually have, so I think we just need to add it to the template.
(Referenced: nf-metadata-dictionary/modules/props.yaml, lines 295 to 299 in 91a33e6)
What value should we be using here? There are a lot of standard genome values, but for my recent annotation (which inspired this issue), the reference is less standard and hosted on Synapse (https://www.synapse.org/#!Synapse:syn50670703). I suppose in this case the name we could use is
Looks like this reference is called
Yah, I saw that, but I think that the Verily flavor has some additional changes.
The file linked in the Synapse ID seems to be similar to the following file listed in the
I wish Verily did a better job and assigned unique IDs for the different builds. They released 4 slightly different builds in one version! :( I guess you could also just link to the Synapse ID where this specific build is stored.
Ah haha, this is very confusing. I didn't read the README, I just looked at this part:
I just read this as "GCA_000001405.15_GRCh38_no_alt_plus_hs38d1_analysis_set.fna.gz" is the 'vanilla' genome before Verily did anything to it. But yeah, the README makes things less clear....
List the new key
Using OLS or a similar resource, please find a standard term that fits your needs. OLS-listed NCIT and OBI dictionaries are preferred sources. Non-OLS sources of ground truth for specific items such as ONCOTREE for cancer types, or GEO for platform definitions are also appropriate.
Key:
reference_genome_build
Provide definition and source for the new key
Definition: The source-specific version of the published genome assembly. [ NCI ]
Source url: https://www.ebi.ac.uk/ols/ontologies/ncit/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FNCIT_C164815
Please describe why this concept is necessary
For aligned read files, it is important to know which reference genome the reads were aligned to. This helps users decide on downstream analytical steps (e.g. whether to re-align or lift over to newer builds) when re-using the data.
Please describe the situations in which this concept may apply
This concept is important when a user 1) is looking for additional data that matches their own analytical dataset (i.e. aligned to the same genome build), or 2) needs to decide on downstream analytical steps after downloading the dataset from the NF Data Portal.
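As a rough illustration of the two situations above, here is a minimal sketch of how a data re-user might act on the proposed key. Everything except the key name `reference_genome_build` is an assumption made for the example: the `AlignedFile` record, the `plan_reuse` helper, the file IDs, and the build values `GRCh37`/`GRCh38` are hypothetical, and the set of allowed values for the key has not been settled in this issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedFile:
    """Hypothetical record for an aligned read file and its annotation."""
    file_id: str                 # illustrative identifier, not a real Synapse ID
    reference_genome_build: str  # value of the proposed key


def plan_reuse(files, target_build):
    """Group files by whether their build matches the user's own dataset.

    Files on the same build can be combined directly; the rest need
    re-alignment or liftover before they are comparable.
    """
    plan = {"use_as_is": [], "realign_or_liftover": []}
    for f in files:
        if f.reference_genome_build == target_build:
            plan["use_as_is"].append(f.file_id)
        else:
            plan["realign_or_liftover"].append(f.file_id)
    return plan


# Illustrative annotations only; the build values are placeholders.
files = [
    AlignedFile("file_a", "GRCh38"),
    AlignedFile("file_b", "GRCh37"),
    AlignedFile("file_c", "GRCh38"),
]

plan = plan_reuse(files, target_build="GRCh38")
print(plan)
```

Note that a plain string comparison like this only works if builds are annotated with consistent values, which is why the question of what value to use for non-standard references (such as the Verily build discussed in the comments) matters.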