1.1 Usage
Welcome to the DiffDomain wiki! Here is a summary of usage.
Usage:
- python diffdomains.py dvsd one <chr> <start> <end> <hic0> <hic1> [options]
- python diffdomains.py dvsd multiple <hic0> <hic1> <tadlist_of_hic0.bed> [options]
- python diffdomains.py visualization <chr> <start> <end> <hic0> <hic1> [options]
- python diffdomains.py adjustment <method> <output_of_dvsd_multiple> <output_file_name> [options]
Options:
- --reso : resolution of the hicfile [default: 100000]
- --min_nbin : minimum number of effective bins [default: 10]
- --f : parameter for filtering out null values of the matrix, in [0, 1) [default: 0.5]
For example, with '--f 0.6', DiffDomain counts the columns of a TAD's contact matrix in which fewer than 40% of the values are missing. If that count is smaller than min_nbin, DiffDomain skips the TAD and sets its results (statistic, the 5th column; P value, the 6th column) to NAN.
In other words, DiffDomain only compares TAD contact matrices that have at least min_nbin columns with less than (1-f)*100% missing values.
- --ofile : the filepath for output file [default: stdout]
- --oprefix : prefix for output files
- --oprefixFig : prefix for output figures
- --sep : delimiter for hicfile [default: \t]
- --hicnorm : hic matrix normalization method [default: KR]
- --chrn : chromosome number [default: ALL]
- --ncore : the number of parallel processes [default: 10]
- --filter : whether to filter out non-reorganized TADs after adjustment [default: False]
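The --f and --min_nbin rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated on this page, not DiffDomain's actual code: the function name `passes_missing_value_filter` is hypothetical, and it assumes that missing values are stored as NaN in a list-of-rows matrix.

```python
import math


def passes_missing_value_filter(tad_matrix, f=0.5, min_nbin=10):
    """Return True if the TAD would be compared under the --f / --min_nbin rule.

    Hypothetical helper: a column is kept when its fraction of missing (NaN)
    values is below 1 - f, and the TAD is compared only if at least
    min_nbin columns are kept.
    """
    n_rows = len(tad_matrix)
    kept_columns = 0
    for column in zip(*tad_matrix):  # iterate over columns
        n_missing = sum(1 for value in column if math.isnan(value))
        if n_missing / n_rows < 1.0 - f:
            kept_columns += 1
    return kept_columns >= min_nbin
```

With '--f 0.6' and the default min_nbin of 10, a 12 x 12 matrix whose first three columns are entirely missing has only 9 usable columns, so under this reading the TAD would be skipped.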
Note:
- For most bulk Hi-C data, such as the Hi-C data in Adiden [Reference], results are not sensitive to the exact value of --f.
- For single-cell Hi-C data, we recommend trying multiple values of --f and choosing one that yields an acceptable number of compared TADs. Because of the high sparsity of single-cell Hi-C data and the variation among imputation methods (such as scHiCluster, Higashi, and scVI-3D), we did not set a default value of --f for this case.
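When trying several values of --f, it can help to count how many TADs were actually compared in each run. The sketch below is one possible way to do that. It relies only on what this page states (the P value is in the 6th column and skipped TADs are reported as NAN). The function name `count_compared_tads`, the tab separator, and the handling of header lines are assumptions.

```python
import csv
import math


def count_compared_tads(lines, sep="\t", pvalue_column=5):
    """Count rows whose P value (6th column, index 5) is a real number.

    Hypothetical helper: `lines` is any iterable of text lines, for example
    an open file holding the output of 'dvsd multiple'.
    """
    compared = 0
    for row in csv.reader(lines, delimiter=sep):
        if len(row) <= pvalue_column:
            continue  # skip blank or short lines
        try:
            pvalue = float(row[pvalue_column])
        except ValueError:
            continue  # skip header lines or non-numeric fields
        if not math.isnan(pvalue):
            compared += 1
    return compared
```

For example, after running 'dvsd multiple' once per candidate --f value, open each output file and pass the file object to `count_compared_tads` to compare the counts.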
Usage:
- python diffdomain-py3/classification.py -d <result_of_diffdomains.py_adjustment> -t <tadlist_of_hic2> [options]
Options:
- --limit : the distance in bases within which two boundaries are judged to be common boundaries [default: 30000]
- --out : the filename of output [default: name_of_-d_types.txt]
- --kpercent : the common boundaries are within max(l*bin, k% of the TAD's length) [default: 10]
- --remote : the limitation of the biggest region [default: 1000000]
- --s1 : int, to skip the first s1 rows in -d [default: 0]
- --s2 : int, to skip the first s2 rows in -t [default: 0]
- --sep1 : the separator of -d [default: \t]
- --sep2 : the separator of -t [default: \t]
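The interplay of --limit and --kpercent can be read as a per-TAD tolerance. The sketch below shows one reading of the max(l*bin, k% of the TAD's length) rule. The function names are hypothetical, l*bin is assumed to equal the --limit value in bases, and the inclusive comparison is an assumption, so treat this as an illustration rather than the behavior of classification.py.

```python
def boundary_tolerance(tad_start, tad_end, limit=30000, kpercent=10):
    """Tolerance in bases: the larger of --limit and k% of the TAD length."""
    tad_length = tad_end - tad_start
    return max(limit, tad_length * kpercent / 100)


def is_common_boundary(pos_a, pos_b, tolerance):
    """Assumed rule: two boundaries are common if they lie within the tolerance."""
    return abs(pos_a - pos_b) <= tolerance
```

Under this reading, a 200 kb TAD uses the --limit value (30000, since 10% of its length is only 20000), while a 1 Mb TAD uses 10% of its length (100000).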
Note:
You can set --limit to adjust what counts as a 'common boundary'.
As described in the paper, we use 3 bins as the threshold for common boundaries.
That is, at 10 kb resolution we set --limit to 30000, and at 25 kb resolution we set --limit to 75000.
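The 3-bin rule is simple arithmetic, sketched here for other resolutions. The helper name `limit_for_resolution` is hypothetical and is not part of DiffDomain.

```python
def limit_for_resolution(resolution_bp, n_bins=3):
    """Return the --limit value (in bases) for a given Hi-C resolution."""
    return n_bins * resolution_bp


# Build the option string for the two resolutions mentioned above.
for resolution in (10000, 25000):
    print(f"--limit {limit_for_resolution(resolution)}")
```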
If you encounter the following error, don't worry.
- AttributeError: 'function' object has no attribute 'straw' : Open the __init__.py of straw (its path is reported in the error message, for example "/home/gum/.conda/envs/diffDomain/lib/python2.7/site-packages/straw/__init__.py") and delete the line "straw = straw_module.straw".
Now, let's get started with some examples of real data in the next chapters!
Usage:
- python diffdomain-py3/subdivide_strength_change.py -f <result_of_classification.py> -h1 <hic_file_of_Condition1> -h2 <hic_file_of_Condition2> -t1 <tadlist_of_hic1> -t2 <tadlist_of_hic2> [options]
Options:
- --out : the filename of output [default: subdivide_strength_change.txt]
- --reso : resolution of the hicfile [default: 100000]
- --sep : the separator for -f, -t1, and -t2 [default: \t]