Example 2: Testing TADs on all chromosomes
The Hi-C data are hosted on Amazon S3.
- Identify TADs in multiple chromosomes simultaneously.
python diffdomain-py3/diffdomains.py dvsd multiple https://hicfiles.s3.amazonaws.com/hiseq/gm12878/in-situ/combined.hic https://hicfiles.s3.amazonaws.com/hiseq/k562/in-situ/combined.hic data/GSE63525_GM12878_primary+replicate_Arrowhead_domainlist.txt --ofile res/temp/temp.txt --reso 10000
Results are saved to <res/temp/temp.txt>.
- Multiple-comparison adjustment.
python diffdomain-py3/diffdomains.py adjustment fdr_bh res/temp/temp.txt res/reorganized_TADs_GM12878_K562.txt
Results are saved to <res/reorganized_TADs_GM12878_K562.txt>.
- Optional parameter [--filter]: keep only reorganized TADs with a BH-adjusted p-value < 0.05.
python diffdomain-py3/diffdomains.py adjustment fdr_bh res/temp/temp.txt res/reorganized_TADs_GM12878_K562_filter.txt --filter true
Results are saved to <res/reorganized_TADs_GM12878_K562_filter.txt>.
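To make the adjustment step more concrete, here is a minimal pure-Python sketch of the Benjamini-Hochberg procedure that the method name fdr_bh refers to. It is not DiffDomain's own implementation; the function name benjamini_hochberg and the p-values are hypothetical and serve only as an illustration.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for five TADs (not taken from real data).
pvals = [0.001, 0.04, 0.03, 0.2, 0.5]
adj = benjamini_hochberg(pvals)

# Mimic the effect of --filter: keep TADs with adjusted p-value < 0.05.
kept = [i for i, q in enumerate(adj) if q < 0.05]
print(kept)
```

In this toy input only the first TAD survives the 0.05 cutoff, because the two p-values 0.03 and 0.04 are both adjusted to about 0.067.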
- Classification of TADs into six subtypes.
This step requires the TAD list from condition 2 (e.g. data/GSE63525_K562_Arrowhead_domainlist.txt).
Run the command:
python diffdomain-py3/classification.py -d res/reorganized_TADs_GM12878_K562.txt -t data/GSE63525_K562_Arrowhead_domainlist.txt --out res/reorganized_TADs_GM12878_K562_subtypes.txt
Results are saved to <res/reorganized_TADs_GM12878_K562_subtypes.txt>.
- Subdividing the strength-change type into two categories.
Run the command:
python diffdomain-py3/subdivide_strength_change.py -f res/reorganized_TADs_GM12878_K562_subtypes.txt -h1 https://hicfiles.s3.amazonaws.com/hiseq/gm12878/in-situ/combined.hic -h2 https://hicfiles.s3.amazonaws.com/hiseq/k562/in-situ/combined.hic -t1 data/GSE63525_GM12878_primary+replicate_Arrowhead_domainlist.txt -t2 data/GSE63525_K562_Arrowhead_domainlist.txt --out res/reorganized_TADs_GM12878_K562_subtypes_up_down.txt --reso 10000
Final results are saved to <res/reorganized_TADs_GM12878_K562_subtypes_up_down.txt>.
The output file <res/reorganized_TADs_GM12878_K562_subtypes_up_down.txt> comprises multiple columns:
- Chromosome Number (chr), Start Position (start), End Position (end), range, type, origin, subtype, significant.
Each row stores information about one TAD.
Reorganized TADs are categorized into six types: loss, merge, split, complex, zoom, and strength-change.
- Type column includes the types: loss, merge, split, complex.
- Subtype column includes zoom and strength-change.
- subdivide_strength_change column includes the two subtypes of strength-change: strength-change up and strength-change down.
A strength-change TAD can be classified into two subtypes.
- Strength-change up TAD: Indicates an increase in Hi-C contact frequencies under biological condition 2.
- Strength-change down TAD: Indicates a decrease in Hi-C contact frequencies under biological condition 2.
The classification is based on the following mathematical definitions.
Given a strength-change TAD:
- m1: Median value of KR-normalized Hi-C contact frequencies within the TAD in condition 1.
- m2: Median value of KR-normalized Hi-C contact frequencies within the same TAD region in condition 2.
- s1: Sum of KR-normalized Hi-C contact frequencies across all condition 1 TADs.
- s2: Sum of KR-normalized Hi-C contact frequencies across all condition 2 TADs.
If the condition (m1 / m2) * (s1 / s2) < 1 is satisfied, the TAD is classified as a strength-change up TAD. Otherwise, it is classified as a strength-change down TAD.
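The rule above can be sketched in a few lines of Python. This is only an illustration of the stated condition, not the code in subdivide_strength_change.py; the function name classify_strength_change, its arguments, and the numbers are hypothetical.

```python
from statistics import median


def classify_strength_change(contacts_c1, contacts_c2, s1, s2):
    """Apply the rule (m1 / m2) * (s1 / s2) < 1 to one strength-change TAD.

    contacts_c1, contacts_c2: KR-normalized contact frequencies inside the
        TAD in condition 1 and condition 2 (assumed already extracted).
    s1, s2: sums of KR-normalized contact frequencies across all TADs of
        condition 1 and condition 2.
    """
    m1 = median(contacts_c1)
    m2 = median(contacts_c2)
    if m2 == 0 or s2 == 0:
        raise ValueError("m2 and s2 must be non-zero")
    ratio = (m1 / m2) * (s1 / s2)
    return "strength-change up" if ratio < 1 else "strength-change down"


# Hypothetical values: the median doubles in condition 2 while the
# condition-wide sums are equal, so the ratio is 0.5.
label = classify_strength_change([2.0, 4.0, 6.0], [6.0, 8.0, 10.0], 1000.0, 1000.0)
print(label)
```

Swapping the two contact lists gives a ratio of 2, so the same TAD would be labeled strength-change down.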
Here are a few simple demonstrations of the output from the subdivide_strength_change.py script.
- Filtering TADs Reorganization Types by Location
You can use 'chr', 'start', 'end', or 'range' to filter specific TADs or sets of TADs.
You can then read the reorganization types of these TADs from the 'type', 'subtype', or 'subdivide_strength_change' columns. The significance of the reorganization is shown in the 'significant' column (0 means not significant, 1 means significant).
For example, by specifying chr=1, start=20680000, end=20830000, origin='condition1', you will locate the specific TAD in the file and see that its reorganization type is 'loss' and that the reorganization is significant.
Loss: "Condition 2 has no TAD that overlaps with or is identical to the reorganized TAD."
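A location lookup like the one above could be sketched as follows. The file layout is an assumption here: the sketch treats the output as tab-separated with a header row carrying the column names listed earlier. The helper find_tads is hypothetical, and apart from the coordinates, origin, type, and significance of the first row, the sample values are placeholders.

```python
import csv
import io

# Hypothetical excerpt of the output file. ASSUMPTION: tab-separated with a
# header row. "NA" is a placeholder for values not shown in the text.
SAMPLE = (
    "chr\tstart\tend\trange\ttype\torigin\tsubtype\tsignificant\n"
    "1\t20680000\t20830000\tNA\tloss\tcondition1\tNA\t1\n"
    "1\t30000000\t30500000\tNA\tmerge\tcondition1\tNA\t1\n"
)


def find_tads(rows, chrom, start, end, origin=None):
    """Return the rows matching a chromosome, start, end and optional origin."""
    return [
        r for r in rows
        if r["chr"] == str(chrom)
        and int(r["start"]) == start
        and int(r["end"]) == end
        and (origin is None or r["origin"] == origin)
    ]


# For a real run, replace io.StringIO(SAMPLE) with an opened output file.
rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
hits = find_tads(rows, 1, 20680000, 20830000, origin="condition1")
for r in hits:
    print(r["type"], r["significant"])
```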
- Filtering TADs by Reorganization Type
Alternatively, you can choose a reorganization type of interest from the 'type', 'subtype', or 'subdivide_strength_change' columns to see all related TADs.
For instance, setting type='merge' and origin='condition1' allows you to query all TADs with a reorganization type of 'merge' (biological condition 1).
For each significantly reorganized TAD, except those classified as 'loss', the corresponding entries in biological condition 2 are listed immediately after the entry for condition 1.
Merge: "The reorganized TAD has a many-to-one identical or overlapping relationship with a TAD in condition 2."
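A type-based query could look like the sketch below. It assumes the rows have already been loaded as dictionaries keyed by column name (for example with csv.DictReader). The helper select and all sample rows are hypothetical, and only a subset of the columns is shown.

```python
from collections import Counter

# Made-up rows: two condition 1 TADs that merge into one condition 2 TAD,
# plus one unrelated loss TAD.
rows = [
    {"chr": "1", "start": "1000000", "end": "1500000", "type": "merge", "origin": "condition1"},
    {"chr": "1", "start": "1500000", "end": "2000000", "type": "merge", "origin": "condition1"},
    {"chr": "1", "start": "1000000", "end": "2000000", "type": "merge", "origin": "condition2"},
    {"chr": "2", "start": "3000000", "end": "3400000", "type": "loss", "origin": "condition1"},
]


def select(rows, **criteria):
    """Return the rows whose columns equal every given criterion."""
    return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]


# All 'merge' TADs that originate from biological condition 1.
merged_c1 = select(rows, type="merge", origin="condition1")

# Number of condition 1 TADs per reorganization type.
per_type = Counter(r["type"] for r in select(rows, origin="condition1"))
print(len(merged_c1), dict(per_type))
```

Passing the filters as keyword arguments keeps the same helper usable for the 'subtype' or 'significant' columns.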
- Other Examples
The following illustrations show how the various reorganization types appear in the output file. Red rectangles highlight the TADs identified in biological condition 1 and their corresponding status in biological condition 2 for a specific reorganization type.
Split: "The reorganized TAD has either a one-to-many identical relationship or a one-to-many overlapping relationship with TADs in condition 2."
Zoom: "The reorganized TAD has a one-to-one overlapping relationship with a TAD in condition 2."
Strength-change: "The reorganized TAD in condition 1 has a one-to-one identical relationship with a TAD in condition 2."
Complex: "All remaining reorganized TADs that do not fit into the previously defined sub-types are classified as complex TADs."