- DiscountPy is written entirely in Python. To run it on your machine, Python >= 3.9 is required.
- Poetry is a tool for dependency management and packaging in Python. Make sure it is installed.
- Preferably run all the commands in PowerShell.
- Installation: Download the zip file or clone the repository.
- Setting up: First install all the dependencies and create the virtual environment. To do so, run the following commands in the workspace terminal.
  poetry install
  poetry update
- Now configure the Python interpreter. To do so, first get the env path. To get the env information, run the following command.
  poetry env info
- Or, to get only the path, run
  poetry env info --path
- To learn more about Poetry, see the Poetry documentation.

Now DiscountPy is ready to run.
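
Once the interpreter is configured, a quick sanity check can confirm that it meets the Python >= 3.9 requirement and runs inside a virtual environment. The following is a minimal sketch, not part of DiscountPy; the function name `interpreter_report` is hypothetical.

```python
import sys


def interpreter_report(min_version=(3, 9)):
    """Summarise whether the running interpreter looks suitable (hypothetical helper)."""
    return {
        "executable": sys.executable,
        "version_ok": sys.version_info >= min_version,
        # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(f"{key}: {value}")
```

Running this with the interpreter found under the path printed by `poetry env info --path` should report `in_virtualenv: True`.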
DiscountPy is a k-mer counting tool. It offers three orderings for counting k-mers: lexicographic, frequency, and universal frequency.
- -k : Length of the k-mer
- -m : Width of the minimizers
- -f : Input dataset (.fasta)
- -o : Order (lex | freq)
- --minimizers : Universal minimizer set
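
To make the roles of `-k`, `-m`, and `-o` concrete, here is a minimal sketch of minimizers and super-mers. It is an illustration only, not DiscountPy's implementation. All function names are hypothetical. It assumes that under frequency ordering the rarest m-mer wins, it compares minimizers by value rather than by position, and it does not cover hashing or universal minimizer sets.

```python
from collections import Counter


def mmers(seq, m):
    """All length-m substrings of seq, left to right."""
    return [seq[i:i + m] for i in range(len(seq) - m + 1)]


def lex_key(mmer):
    """Lexicographic ordering: the alphabetically smallest m-mer wins."""
    return mmer


def make_freq_key(reads, m):
    """Frequency ordering (assumed): the rarest m-mer wins, ties broken lexicographically."""
    counts = Counter(mm for read in reads for mm in mmers(read, m))
    return lambda mmer: (counts[mmer], mmer)


def minimizer(kmer, m, key):
    """The m-mer of kmer that is smallest under the given ordering."""
    return min(mmers(kmer, m), key=key)


def super_mers(read, k, m, key):
    """Split read into (minimizer, super-mer) pairs.

    A super-mer here is a maximal run of consecutive k-mers
    that share the same minimizer string.
    """
    out = []
    current, start = None, 0
    for i in range(len(read) - k + 1):
        mz = minimizer(read[i:i + k], m, key)
        if current is None:
            current, start = mz, i
        elif mz != current:
            # The run covers k-mers start..i-1, i.e. read[start : i - 1 + k].
            out.append((current, read[start:i - 1 + k]))
            current, start = mz, i
    if current is not None:
        out.append((current, read[start:]))
    return out


if __name__ == "__main__":
    print(super_mers("TTACGGTT", k=5, m=2, key=lex_key))
```

The two orderings can pick different minimizers for the same k-mer: in `ACGT` with `m = 2`, lexicographic ordering selects `AC`, while a frequency ordering built from a read in which `AC` is common would prefer a rarer m-mer such as `CG`.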
- To get the hashed super-mers with minimizers:
  - By lexicographic ordering
    discount -k 28 -o lex -f data/SRR094926.fasta
    or
    discount -k 28 -m 10 -o lex -f data/SRR094926.fasta
  - By frequency ordering. The default value is -o freq, so -o can be omitted:
    discount -k 28 -f data/SRR094926.fasta
    or, with the order given explicitly,
    discount -k 28 -o freq -f data/SRR094926.fasta
  - By universal frequency ordering
    discount -k 28 -f data/SRR094926.fasta --minimizers PASHA/pasha_all_28_10.txt
    or
    discount -k 28 -o freq -f data/SRR094926.fasta --minimizers PASHA/pasha_all_28_10.txt
- To write the hashed super-mers to a file:
  discount -k 28 -o freq -f data/SRR094926.fasta --minimizers PASHA/pasha_all_28_10.txt --output output/xxx.txt
- Finally, to get the counts of the k-mers: sort the file generated above externally, then pass the sorted file as input.
  discount -k 28 -f sortedXYZ.txt --count directory/to/counted-kmer-file
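
The external sorting step can be done with any tool. As one possible sketch, the helper below sorts a text file line by line in Python. It assumes the generated file holds one record per line, that plain lexicographic line order is what the counting step expects, and that the file fits in memory; none of these is confirmed by DiscountPy itself. The function name `sort_lines` is hypothetical.

```python
from pathlib import Path


def sort_lines(src, dst):
    """Write the lines of src to dst in sorted order and return the line count.

    Assumption: one record per line, small enough to sort in memory.
    """
    lines = Path(src).read_text().splitlines()
    lines.sort()
    Path(dst).write_text("".join(line + "\n" for line in lines))
    return len(lines)
```

For example, `sort_lines("output/xxx.txt", "sortedXYZ.txt")` would produce the input file used in the counting command above. For files too large for memory, a disk-based sort utility is the safer choice.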