IAGS_AUTO

The project is based on IAGS and automates process

Download

You can use this tool via conda, or by downloading the source code.

conda (Recommend)

Creating a Virtual Environment
```
conda create -n iags_auto python=3.9
```

Download

conda install -c gurobi -c conda-forge -c huntguo iags_auto

Source code

Download the source code

wget https://codeload.github.com/99gloom/IAGS_AUTO/zip/refs/heads/main

Download mono to run DRIMM
```
sudo apt install mono-devel
```

Note: After downloading both methods, you need to activate gurobi, here we provide a help document to help you get the license.

Usage

Files

The IAGS_AUTO tool requires three types of files: species GFF files, an orthogroup.tsv file, and a species.tree file. Please place these three types of files in the same folder. The GFF and orthogroup.tsv files are the same as those required for the previous script (processDrimm).

The following will be introduced one by one:

GFF files: The GFF files have the same format as the input for MCScanX. The file contains four columns, namely chromosome name, gene name, gene start coordinate, and gene end coordinate. The format is as follows:
```
sp_name  gene_name  starting_position  ending_position
```
Orthogroups.tsv: The output file of OrthoFinder.
species.tree: WGD-Newick format. Essentially a modified version of the Newick format, with the addition of "[WGD]" markers at the WGD (Whole Genome Duplication) positions in the tree. In the figure below, the red dots represent WGD markers. (At the end of the document, whether there is a ';' or not is acceptable)

Application

command	parameters	instructions
-f, --filepath	./file_dir	Directories where the three required files are stored
-c, --cycleLength	The default value is 20	The continuity of synteny blocks
-d, --dustLength	Default value is all species copy number plus 1	It controls the upper limit of gene family. The gene family will be filtered when homologous genes exceeding dustThreshold
-s, --shape	"s" (Default)	Chromosome shape. "s" represents string chromosomes and “c” represents circular chromosomes
-s, --shape	"c"
-m, --model	"manual"	Default is None. When users need to specify the outgroup manually, first use the "manual" mode to generate the node computation order file "model_and_outgroup.txt", and modify the information of the outgroup used. Then use "continue" mode to generate the result based on the problem pieces modified in the previous step
-m, --model	"continue"
"--dotplot"	-	Generate a two-by-two dotplot for each species
"--expand"	-	Expanding synteny block coverage through graph algorithm
--check	"yes" (Default)	Whether to stop the program when the percentage of empty chromosomes is greater than 30% after filtering Synteny blocks by copy number (Stopping the program when the quality of the synteny block is low)
--check	"no"

Result

The "Result" folder will be generated in the run directory, and there are subfolders inside, which are Tree_File, Process_Drimm, and IAGS in the order of generation. The files that the users are primarily concerned about are "IAGS" and "model_and_outgroup.txt" in "Tree_File". "IAGS" is the final generated ancestors genome result and chromosome painting. "model_and_outgroup.txt" is the outgroup information used by each ancestor node, which requires additional processing if manually specified.

The role of each file is described in more detail below:

Tree_File
- species.ratio and all.ratio: Copy number information for all species;
- Evolutionary_tree.txt: Evolutionary tree shape and distribution of all nodes;
- model_and_outgroup.txt: Information about each ancestor node computation in the format: currently computed ancestor node : IAGS computation model : child node : outgroup. If the model is GMP or MultiCopyGMP, there are two child nodes. In particular, in the MultiCopyGMP model, if its outgroup does not have enough copy number, the outgroup chromosome will be doubled manually to compute, denoted by "*N".
Process_Drimm
Essentially an automated process for processDrimm.
- Process_OrthoFind: Gene sequences were generated by coding genes through species GFF and orthogroup.tsv files;
- Drimm_Synteny_Output: Raw results generated after running DRIMM;
- Drimm_Blocks: LCS (Longest Common Subsequence arithmetic, described in the processDrimm) of the raw results from the DRIMM run for downstream analysis;
- Final_Blocks: Filter Drimm_Blocks proportionally to generate blocks that can be run by IAGS.
IAGS
- Name of each ancestor node: Ancestor node details including computed blocks, CRB ratio evaluation etc;
- painting: Chromosome painting of the ancestral genome, where "Painting_start_point.txt" records the basal ancestor of the drawing;
- shufflingEvents.txt: Species fission and fusion information.

Example

1.Quick start
```
iags_auto -f ./example
```
2.Specified parameter
```
iags_auto -f ./example -c 60 -d 12 -s s
```
3.Run with graph expansion algorithm
```
iags_auto -f ./example --expand
```
4.Manual designation of outgroups
```
iags_auto -f ./example -m manual
```
Subsequently change the outgroups in "Result/Tree_File/model_and_outgroup.txt" and continue to run.
```
iags_auto -f ./example -m continue
```
5.Painting dotplot
```
iags_auto -f ./example --dotplot
```
Using this command generates an additional "Dotplot" folder in "Result" where the results will be stored.
6.Close chromosomes check
```
iags_auto -f ./example --check no
```

If you choose to download the source code for use, you will need to replace iags_auto with python IAGS_ATUO.py in the above command to start it by calling the python file directly.

Support

Since IAGS is based on gurobi for integer optimization, this tool requires users to download and activate the gurobi license by themselves, here we provide a help document to help users install and activate gurobi.

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
.idea		.idea
example		example
utils		utils
IAGS_AUTO.py		IAGS_AUTO.py
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

IAGS_AUTO

Download

conda (Recommend)

Source code

Usage

Files

Application

Result

Example

Support

About

Releases

Packages

Languages

License

99gloom/IAGS_AUTO

Folders and files

Latest commit

History

Repository files navigation

IAGS_AUTO

Download

conda (Recommend)

Source code

Usage

Files

Application

Result

Example

Support

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages