1.1 Usage

Welcome to the DiffDomain wiki! This page summarizes usage.

Usage

Main method

Usage:

 python diffdomains.py dvsd one <chr> <start> <end> <hic0> <hic1> [options]

 python diffdomains.py dvsd multiple <hic0> <hic1> <tadlist_of_hic0.bed> [options]

 python diffdomains.py visualization <chr> <start> <end> <hic0> <hic1> [options]

 python diffdomains.py adjustment <method> <output_of_dvsd_multiple> <output_file_name> [options]

Options:

--reso : resolution for hicfile [default: 100000]
--min_nbin : effective number of bins [default: 10]
--f : filtering parameter for the null values of the matrix, in the range [0, 1) [default: 0.5]

For example, with '--f 0.6', DiffDomain counts the columns of a TAD's contact matrix whose proportion of missing values is lower than 40%. If that count is smaller than min_nbin, DiffDomain skips the TAD and sets its result (statistic, the 5th column; P value, the 6th column) to NAN.
In other words, DiffDomain compares only TAD contact matrices that have at least min_nbin columns whose proportion of missing values is less than (1-f)*100%.
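
The filtering rule above can be illustrated with a small sketch. This is not DiffDomain's own code: the function names `usable_columns` and `is_compared` are hypothetical, and the sketch assumes that missing values are stored as NaN, which may differ from how DiffDomain represents them internally.

```python
import math


def usable_columns(matrix, f):
    """Count columns whose fraction of missing (NaN) entries is below 1 - f.

    Assumption: missing values are encoded as NaN in a list-of-lists matrix.
    """
    n_rows = len(matrix)
    count = 0
    for j in range(len(matrix[0])):
        n_missing = sum(1 for i in range(n_rows) if math.isnan(matrix[i][j]))
        if n_missing / n_rows < 1 - f:
            count += 1
    return count


def is_compared(matrix, f=0.5, min_nbin=10):
    """Return True if the TAD keeps at least min_nbin usable columns."""
    return usable_columns(matrix, f) >= min_nbin


nan = float("nan")
# Toy 4 x 4 matrix (not real Hi-C data); the columns have
# 0%, 25%, 50% and 75% missing values, from left to right.
toy = [
    [1.0, 2.0, 3.0, nan],
    [1.0, 2.0, 3.0, nan],
    [1.0, 2.0, nan, nan],
    [1.0, nan, nan, 4.0],
]

# With f = 0.6, only columns with less than 40% missing values count.
print(usable_columns(toy, f=0.6))
print(is_compared(toy, f=0.6, min_nbin=3))
```

In this toy case only the first two columns pass the 40% threshold, so a TAD like this would be skipped whenever min_nbin is larger than 2.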

--ofile : the filepath for output file [default: stdout]
--oprefix : prefix for output files
--oprefixFig : prefix for output figures
--sep : delimiter for hicfile [default: \t]
--hicnorm : hic matrix normalization method [default: KR]
--chrn : chromosome number [default: ALL]
--ncore : the number of parallel processes [default: 10]
--filter : whether to filter out TADs that are not reorganized after adjustment [default: False]

Note:

For most bulk Hi-C data, such as the Hi-C data in Adiden [Reference], the results are not sensitive to the exact value of --f.
For single-cell Hi-C data, we recommend that users try multiple values of --f and choose one that gives an acceptable number of compared TADs. Because of the high sparsity of single-cell Hi-C data and the variation among imputation methods (such as scHiCluster, Higashi, scVI-3D), we did not set a default value of --f for such data.
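
As a convenience when scripting several runs (for example, when trying multiple resolutions), the `dvsd multiple` usage line above can be assembled programmatically. The following is a minimal sketch: the helper `dvsd_multiple_cmd` and the file names are hypothetical, and only flags listed in the options above are used.

```python
import shlex


def dvsd_multiple_cmd(hic0, hic1, tadlist, reso=100000, ofile=None, ncore=None):
    """Build the argument list for 'python diffdomains.py dvsd multiple'.

    Hypothetical helper: it only arranges the arguments documented above
    and does not run DiffDomain itself.
    """
    cmd = ["python", "diffdomains.py", "dvsd", "multiple", hic0, hic1, tadlist,
           "--reso", str(reso)]
    if ofile is not None:
        cmd += ["--ofile", ofile]
    if ncore is not None:
        cmd += ["--ncore", str(ncore)]
    return cmd


# The file names below are placeholders, not files shipped with DiffDomain.
cmd = dvsd_multiple_cmd("cond0.hic", "cond1.hic", "tads_cond0.bed",
                        reso=10000, ofile="dvsd_multiple.txt")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains diffdomains.py.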

Classification

Usage:

 python diffdomain-py3/classification.py -d <result_of_diffdomains.py_adjustment> -t <tadlist_of_hic2> [options]

Options:

--limit : the distance in bases within which two boundaries are judged to be common boundaries [default: 30000]
--out : the filename of output [default: name_of_-d_types.txt]
--kpercent : the common boundaries are within max(l*bin, k% of the TAD's length) [default: 10]
--remote : the limitation of the biggest region [default: 1000000]
--s1 : int, to skip the first s1 rows in -d [default: 0]
--s2 : int, to skip the first s2 rows in -t [default: 0]
--sep1 : the separator of -d [default: \t]
--sep2 : the separator of -t [default: \t]

Note:
You can set --limit to adjust what counts as a 'common boundary'.
As described in the paper, we use 3 bins as the threshold for common boundaries.
That means at 10 kb resolution --limit is set to 30000, and at 25 kb resolution --limit is set to 75000.
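
The arithmetic behind this note can be written out as a short sketch. Both function names are hypothetical. `boundary_tolerance` is only one possible reading of the --kpercent description (the larger of --limit and k% of the TAD length) and has not been checked against classification.py.

```python
def limit_for_resolution(reso, n_bins=3):
    """Return the --limit value, in bases, for a threshold of n_bins bins."""
    return n_bins * reso


def boundary_tolerance(limit, kpercent, tad_length):
    """Assumed reading of --kpercent: max(limit, k% of the TAD length)."""
    return max(limit, tad_length * kpercent / 100)


print(limit_for_resolution(10000))   # 10 kb resolution
print(limit_for_resolution(25000))   # 25 kb resolution
print(boundary_tolerance(30000, 10, 1000000))
```

Under this reading, a 1 Mb TAD with the default --kpercent of 10 would tolerate boundary shifts of up to 100 kb, while shorter TADs fall back to the --limit value.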

Questions

If you encounter the following error, don't worry.

AttributeError: 'function' object has no attribute 'straw' : Open the __init__.py of straw (its path is reported in the error message, for example "/home/gum/.conda/envs/diffDomain/lib/python2.7/site-packages/straw/__init__.py") and delete the line "straw = straw_module.straw".
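
If you prefer to script this fix instead of editing the file by hand, the sketch below removes the offending line from the text of a file. The helper `drop_straw_alias` is hypothetical, and the sample text is a toy stand-in, not the real contents of straw's __init__.py.

```python
def drop_straw_alias(source):
    """Return source without any line that reads 'straw = straw_module.straw'."""
    kept = [
        line
        for line in source.splitlines(keepends=True)
        if line.strip() != "straw = straw_module.straw"
    ]
    return "".join(kept)


# Toy stand-in for the file contents (the real file may look different).
sample = "first_line = 1\nstraw = straw_module.straw\nlast_line = 2\n"
print(drop_straw_alias(sample))
```

To apply it, read the __init__.py reported in your error message, pass its text through `drop_straw_alias`, and write the result back after saving a backup copy of the original file.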

Now, let's get started with some examples of real data in the next chapters!

Subdivide_Strength_Change

Usage:

 python diffdomain-py3/subdivide_strength_change.py -f <result_of_classification.py> -h1 <hic_file_of_Condition1> -h2 <hic_file_of_Condition2> -t1 <tadlist_of_hic1> -t2 <tadlist_of_hic2> [options]

Options:

--out : the filename of output [default: subdivide_strength_change.txt]
--reso : resolution for hicfile [default: 100000]
--sep : the separator for -f,-t1, and -t2 [default: \t]
