This repository has been archived by the owner on Jun 16, 2023. It is now read-only.

Update ensg-hugo-rmtl-mapping.tsv in v7 #125

Closed
Tracked by #92
jharenza opened this issue Jul 20, 2021 · 2 comments

@jharenza
Collaborator

What data file(s) does this issue pertain to?

ensg-hugo-rmtl-v1-mapping.tsv

What release are you using?

v6

Put your question or report your issue here.

The table has been updated per Zachary Dorman:

FYI all - I put an updated version of RMTL_Expanded.xlsx (for Excel browsing) and a stripped down RMTL.csv (for computable use) in Box here: https://nih.app.box.com/folder/135489780289
The primary change is incorporating the FDA's gene abnormality column into the FDA Target for all targets for more complete display in Open Targets. As for actual target changes:
- One unique target was added to the list (BCR)
- "CBL" was changed to "CRBN" based on comments from the FDA
- Several non-unique targets were added to the list due to a more formulaic approach to isolating gene abnormalities listed
I started a more detailed change log in the same folder if interested
Dev instance of Open Targets will reflect these changes in the near future.

Also attached below:
RMTL.csv
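
A quick way to confirm target changes like the ones listed above is to compare the gene symbols in the old and new lists. The following is a minimal sketch, not the workflow actually used for this update: the function name `diff_targets` and the short symbol lists are illustrative assumptions, and only BCR, CBL, and CRBN come from the change summary.

```python
def diff_targets(old_symbols, new_symbols):
    """Return (added, removed) gene symbols between two target lists."""
    old_set, new_set = set(old_symbols), set(new_symbols)
    return sorted(new_set - old_set), sorted(old_set - new_set)


# Hypothetical excerpts for illustration only, not the real RMTL contents.
old_targets = ["ALK", "CBL", "MYCN"]
new_targets = ["ALK", "BCR", "CRBN", "MYCN"]

added, removed = diff_targets(old_targets, new_targets)
print("added:", added)
print("removed:", removed)
```

In practice the two lists would be read from the gene symbol column of the previous and updated RMTL files.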

I will also update the filename to ensg-hugo-rmtl-mapping.tsv so it is not changing with each release since we will capture the versioning within the file.

@jharenza jharenza self-assigned this Jul 20, 2021
@jharenza jharenza mentioned this issue Jul 20, 2021
18 tasks
@jharenza
Collaborator Author

jharenza commented Jul 23, 2021

From Zach Dorman:

@jo Lynne Rokita it looks like we will go with the #.# model for versioning. First value is FDA release, second value is our iteration within that FDA release. (I'll write up details). There's strong preference to start the convention with the new RMTL version from this week as v1.0. Will your systems be able to differentiate the original file v1 from the updated v1.0 ? I'm assuming it's stored as a string so 1 =/= 1.0, but want to make sure.
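
The string-versus-number concern raised above can be illustrated with a small sketch. It assumes version labels look like `v1` or `v1.0`; the helper name `parse_rmtl_version` is hypothetical and is not part of any existing pipeline.

```python
def parse_rmtl_version(label):
    """Split a label such as 'v1.0' into (fda_release, iteration).

    A label without a '.' (for example the original 'v1') gets an
    iteration of None, so it stays distinct from 'v1.0'.
    """
    text = label.strip().lstrip("vV")
    major, sep, minor = text.partition(".")
    return int(major), (int(minor) if sep else None)


# Stored as strings, the two labels are different values.
assert "1" != "1.0"
# Parsed as numbers, they would collapse into the same value.
assert float("1") == float("1.0")

print(parse_rmtl_version("v1"))
print(parse_rmtl_version("v1.0"))
```

Keeping the version as a string, or as a (release, iteration) pair, avoids the collision that a numeric column type would introduce.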

The new file used in the merge is https://nih.box.com/s/mdjpng8thhfhrmemi3v30tvuox2ho1pq

jharenza pushed a commit to d3b-center/OpenPedCan-analysis that referenced this issue Jul 23, 2021
# release notes
## current release
- release date: 2021-07-23
- status: available
- changes:
    - Updated EFO/MONDO mapping file per [ticket 88](d3b-center/ticket-tracker-OPC#88)
    - Added GTEX UBERON mapping files for subgroup and group per [ticket 85](d3b-center/ticket-tracker-OPC#85)
      - Collapsed `Cerebellum hemisphere` and `Cerebellum` to `Cerebellum` since GTEX has the same UBERON code listed for both per [ticket 106](d3b-center/ticket-tracker-OPC#106)
      - Renamed `Whole Blood` to `Blood` per John Maris's suggestion to alphabetize and [ticket 106](d3b-center/ticket-tracker-OPC#106).
        - Note: GTEx v8 lists both the "whole blood" subgroup and the "whole blood" group as UBERON_0013756, which maps to venous blood, whereas "Whole blood" maps to UBERON_0000178.
        - After inquiry at GTEx, we were told they are equivalent terms as seen in [this link](https://www.ebi.ac.uk/ols/ontologies/uberon/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FUBERON_0013756)
    - Updated `ensg-hugo-rmtl-v1-mapping.tsv` with minor updates according to [ticket 125](d3b-center/ticket-tracker-OPC#125) and changed filename to `ensg-hugo-rmtl-mapping.tsv`
    - Histology file updates:
      - Collapsed `Cerebellum hemisphere` and `Cerebellum` to `Cerebellum` since GTEX has the same UBERON code listed for both per [ticket 106](d3b-center/ticket-tracker-OPC#106)
      - Renamed `Whole Blood` to `Blood` per John Maris's suggestion to alphabetize and [ticket 106](d3b-center/ticket-tracker-OPC#106).
      - Added inferred strandedness to RNA-Seq samples per [ticket 104](d3b-center/ticket-tracker-OPC#104)
      - Added `broad_tumor_descriptor` to designate grouped `Diagnosis` and `Relapse` samples used in SNV, CNV, Fusion tables as well as for grouping on pedcbio per [ticket 109](d3b-center/ticket-tracker-OPC#109)
      - Added ploidy information for TARGET AML and NBL WXS samples per [ticket 121](d3b-center/ticket-tracker-OPC#121)
      - Collapsed TARGET ids containing suffixes to match the BAM file sample IDs from GDC and match the RDS processed files per [comment here](https://github.com/d3b-center/D3b-codes/pull/41#issuecomment-885809293)
    - Added TARGET NBL and AML WXS, PBTA WXS CNV calls to `cnv-cnvkit.seg.gz` and `cnv-controlfreec.tsv.gz` per [ticket 80](d3b-center/ticket-tracker-OPC#80)
    - Added `consensus_wgs_plus_cnvkit_wxs_x_and_y.tsv.gz` (removed WGS only file `consensus_seg_annotated_cn_x_and_y.tsv.gz`) and `consensus_wgs_plus_cnvkit_wxs_autosomes.tsv.gz` (removed WGS only file `consensus_seg_annotated_cn_autosomes.tsv.gz`) containing consensus WGS and CNVkit WXS data per [ticket 102](d3b-center/ticket-tracker-OPC#102)
    - Updated RNA-Seq files to include TARGET RNA (N = 1329) samples:
      - fusion-arriba.tsv.gz
      - fusion-starfusion.tsv.gz
      - fusion-putative-oncogenic.tsv
      - gene-counts-rsem-expected_count-collapsed.rds
      - gene-expression-rsem-tpm-collapsed.rds
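
The GTEx subgroup relabeling described in the release notes above (collapsing `Cerebellum hemisphere` into `Cerebellum` and renaming `Whole Blood` to `Blood`) can be sketched as a simple lookup. This is an illustration under assumptions, not the code used for the release: the names `RELABEL` and `relabel_subgroup` are hypothetical, and `Liver` is included only as an example of a label that is left unchanged.

```python
# Relabeling rules taken from the release notes; names here are illustrative.
RELABEL = {
    "Cerebellum hemisphere": "Cerebellum",
    "Whole Blood": "Blood",
}


def relabel_subgroup(name):
    """Return the collapsed or renamed label, or the input if no rule applies."""
    return RELABEL.get(name, name)


labels = ["Cerebellum hemisphere", "Cerebellum", "Whole Blood", "Liver"]
collapsed = sorted({relabel_subgroup(label) for label in labels})
print(collapsed)
```

Applying the same lookup to both the UBERON mapping files and the histology file keeps the labels consistent between them.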
@jharenza
Collaborator Author

closing with PR 61
