A tool for counting exact K-mer occurrences in a DNA or RNA sequence very, very quickly (where K=32).
tallyman --rna <haystack> --dna <needles> -o <output>
- haystack is a FASTX file of sequences to be searched
- needles is a FASTX file of 32-mers to be searched for
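This README does not describe how Tallyman counts matches internally, so the following is only a minimal sketch of one common approach to exact 32-mer counting, not Tallyman's actual code. It packs each base into 2 bits so that a 32-mer fits in a single u64, then slides a window over the haystack. The names encode_base, encode_kmer, and count_needles are hypothetical. Treating U as T (so RNA haystacks match DNA needles) and resetting on ambiguous bases such as N are assumptions. Reverse complements are not considered.

```rust
use std::collections::HashMap;

/// K-mer length; 32 bases at 2 bits each fit exactly in one u64.
const K: usize = 32;

/// Map a base to a 2-bit code. Assumption: U is treated like T so that
/// RNA and DNA compare equal. Anything else (N, gaps, ...) yields None.
fn encode_base(b: u8) -> Option<u64> {
    match b.to_ascii_uppercase() {
        b'A' => Some(0),
        b'C' => Some(1),
        b'G' => Some(2),
        b'T' | b'U' => Some(3),
        _ => None,
    }
}

/// Pack a 32-mer into a u64. Returns None if the length is not K
/// or the k-mer contains a base that cannot be encoded.
fn encode_kmer(kmer: &[u8]) -> Option<u64> {
    if kmer.len() != K {
        return None;
    }
    let mut packed = 0u64;
    for &b in kmer {
        packed = (packed << 2) | encode_base(b)?;
    }
    Some(packed)
}

/// Count how often each (valid) needle occurs in one haystack sequence.
/// Keys of the returned map are the packed needles.
fn count_needles(haystack: &[u8], needles: &[&[u8]]) -> HashMap<u64, u64> {
    let mut counts: HashMap<u64, u64> = needles
        .iter()
        .filter_map(|n| encode_kmer(n))
        .map(|k| (k, 0))
        .collect();

    let mut window = 0u64; // the last (up to) 32 bases, packed
    let mut valid = 0usize; // consecutive encodable bases seen so far
    for &b in haystack {
        match encode_base(b) {
            Some(code) => {
                // Shifting a u64 left by 2 drops the oldest base automatically.
                window = (window << 2) | code;
                valid += 1;
                if valid >= K {
                    if let Some(c) = counts.get_mut(&window) {
                        *c += 1;
                    }
                }
            }
            // An ambiguous base invalidates every window that overlaps it.
            None => {
                valid = 0;
                window = 0;
            }
        }
    }
    counts
}
```

Because each window is a single integer, every lookup is one hash probe regardless of how many needles there are, which is one reason a fixed K of 32 is convenient.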
Tallyman is implemented in the Rust programming language. Tooling instructions are below. They assume you already have the Rust toolchain installed. If you do not, install it by following the instructions at https://rustup.rs.
- Run unit tests:
cargo test
- Run the demo:
cargo run
- Create a release build (faster); the binary will end up in target/release/:
cargo build --release
- Format the code (do this before pushing):
cargo fmt
To run the benchmarks, you will need to install hyperfine.
On a Mac this can be done through Homebrew using brew install hyperfine.
You can also use the setup-mac make target: make setup-mac.
Benchmarks may then be run with make benchmark.
The default benchmark searches a file with 1 million auto-generated sequences for 999 auto-generated 32-mers.
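How the benchmark inputs are generated is not shown here. As a rough illustration only, random sequences of this kind could be produced with a sketch like the one below. The names XorShift64 and random_fasta, the FASTA record naming, and the sequence length are all hypothetical, and a small hand-written xorshift generator stands in for a real random number crate so that the example has no dependencies.

```rust
/// Tiny xorshift64 PRNG so the sketch needs no external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Generate `count` random DNA sequences of `len` bases as FASTA text.
/// The same seed always produces the same output.
fn random_fasta(count: usize, len: usize, seed: u64) -> String {
    let mut rng = XorShift64(seed.max(1)); // xorshift state must be non-zero
    let mut out = String::new();
    for i in 0..count {
        out.push_str(&format!(">seq{}\n", i));
        for _ in 0..len {
            // The top two bits of the random word select one of four bases.
            out.push(b"ACGT"[(rng.next_u64() >> 62) as usize] as char);
        }
        out.push('\n');
    }
    out
}
```

Calling random_fasta(999, 32, seed) would give a needle file of the size mentioned above, and a larger count and length would give a haystack.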
- Sarah Walling
- Travis Wheeler
- George Lesica
- Ken Youens-Clark