Python package for fCAT, a feature-aware Completeness Assessment Tool
fCAT tool is distributed as a python package called fcat. It is compatible with Python ≥ v3.7.
You can install fcat using pip
:
# python3 -m pip install fcat
python3 -m pip install git+https://github.com/BIONF/fCAT
or, in case you do not have admin rights, and don't use package systems like Anaconda to manage environments you need to use the --user
option:
# python3 -m pip install --user fcat
python3 -m pip install --user git+https://github.com/BIONF/fCAT
and then add the following line to the end of your ~/.bashrc or ~/.bash_profile file, restart the current terminal to apply the change (or type source ~/.bashrc
):
export PATH=$HOME/.local/bin:$PATH
The complete process of fCAT can be done using one function fcat
fcat --coreDir /path/to/fcat_data --coreSet eukaryota --refspecList "HOMSA@9606@2" --querySpecies /path/to/query.fa [--annoQuery /path/to/query.json] [--outDir /path/to/fcat/output]
where eukaryota is name of the fCAT core set (equivalent to BUSCO set); HOMSA@9606@2 is the reference species from that core set that will be used for the ortholog search; query is the name of species of interest. If --annoQuery
not specified, fCAT fill do the feature annotation for the query proteins using FAS tool.
You will find the output in the /path/to/fcat/output/fcatOutput/eukaryota/query/ folder, where /path/to/fcat/output/ could be your current directory if you not specified --outDir
when running fcat
. The following important output files/folders can be found:
- all_summary.txt: summary of the completeness assessment using all 4 score modes
- fdogOutput.tar.gz: a zipped file of the ortholog search result
- mode_1, mode_2, mode_3 and mode_4: detailed output for each score mode
- phyloprofileOutput: output phylogenetic profile data that can be used with [PhyloProfile tool](https://github.com/BIONF/PhyloProfile)
fCAT algorithm consists of 3 main steps:
- Calculate group-specific cutoffs for a core set
Please make sure that the R dependencies are available before running this function.
fcat.cutoff --coreDir /path/to/fcat_data --coreSet eukaryota
- Search for orthologs in a gene set of interst and create phylogenetic profiles
fcat.ortho --coreDir /path/to/fcat_data --coreSet eukaryota --refspecList "HOMSA@9606@2" --querySpecies /path/to/query.fa --annoQuery /path/to/query.json
- Create report for completeness assessment
fcat.report --coreDir /path/to/fcat_data --coreSet eukaryota --outDir /path/to/fcat/output --queryID queryID --mode 1
Any bug reports or comments, suggestions are highly appreciated. Please open an issue on GitHub or be in touch via email.
For further support or bug reports please contact: [email protected]