Commit
* Building taxonomy (#38)
* building taxonomy files but this script will be deprecated right away
* deprecated
* script to build taxonomy with src files
* m
* move old taxonomy to deprecated
* remove old 'versioned' files outside of git versioning
* filter taxonomy script
* complete the taxonomy
* updated scripts for compiling databases
* dev branch testing
* fix lmono test a bit
* .
* Fix the taxonomy tests (#39)
* building taxonomy files but this script will be deprecated right away
* deprecated
* script to build taxonomy with src files
* m
* move old taxonomy to deprecated
* remove old 'versioned' files outside of git versioning
* filter taxonomy script
* complete the taxonomy
* updated scripts for compiling databases
* dev branch testing
* fix lmono test a bit
* .
* fix paths
* updated PATH
* updated PATH
* troubleshooting
* fix PATH again
* fix ls path
* remove that step
* updated tests to reflect build-taxonomy (#40)
* fix path to taxonomy files
* download and build taxonomy
* merge Listeria into Yersinia matrix
* m
* updated output directory as matrix.GENUS
* kraken1 tests patches
* m
* Fixed two more tests (#41)
* update yml
* query fallback
* debugging msg
* fix path to taxonomydb
* print first two lines of fasta files
* helpful cut statement
* remove head statement in last step
* bump version
Showing 21 changed files with 273 additions and 445 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
on: | ||
push: | ||
branches: [master] | ||
branches: [master, dev] | ||
name: Pull-down-all-accessions | ||
|
||
jobs: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
edirect | ||
share |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -39,20 +39,22 @@ using your own email address instead of `[email protected]`. | |
|
||
## Download instructions | ||
|
||
For usage, run `perl bin/downloadKalamari.pl --help` | ||
First, build the taxonomy. | ||
The script `buildTaxonomy.sh` uses the diffs in Kalamari to enhance the default NCBI taxonomy. | ||
Next, `filterTaxonomy.sh` reduces the taxonomy files to just those found in Kalamari. | ||
`filterTaxonomy.sh` uses `taxonkit` and so this needs to be in your | ||
environment before starting. | ||
|
||
SRC=Kalamari | ||
perl bin/downloadKalamari.pl -o $SRC src/chromosomes.tsv | ||
bash bin/buildTaxonomy.sh | ||
bash bin/filterTaxonomy.sh | ||
|
||
### ...with plasmids | ||
To download the chromosomes and plasmids, use the `.tsv` files, respectively, with `downloadKalamari.pl`. | ||
Run `downloadKalamari.pl --help` for usage. | ||
However, to download the files to a standard location, | ||
please simply use `downloadKalamari.sh` which uses | ||
`downloadKalamari.pl` internally. | ||
|
||
SRC=Kalamari | ||
perl bin/downloadKalamari.pl -o $SRC src/chromosomes.tsv src/plasmids.tsv | ||
|
||
### taxonomy | ||
|
||
The taxonomy files `nodes.dmp` and `names.dmp` are under `src/taxonomy-VER` | ||
where `VER` is the version of Kalamari. | ||
bash bin/downloadKalamari.pl | ||
|
||
## Database formatting instructions | ||
|
||
|
@@ -80,4 +82,4 @@ Please see [CONTRIBUTING.md](CONTRIBUTING.md) | |
|
||
## Citation | ||
|
||
Please refer to the ASM 2018 poster under docs | ||
Please refer to the ASM 2018 poster under docs. |
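
The README hunk above states that `filterTaxonomy.sh` needs `taxonkit` in the environment before starting. A minimal sketch of a preflight check for that requirement follows. The helper name `require_tools` is hypothetical and is not part of Kalamari; it only relies on the POSIX `command -v` builtin.

```shell
#!/bin/bash
# Hypothetical helper (not part of Kalamari): check that every named
# tool is on PATH and report all missing ones before failing.
require_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "ERROR: required tool not found in PATH: $tool" >&2
      missing=1
    fi
  done
  # Return 0 only if nothing was missing
  return "$missing"
}

# Example use before the taxonomy steps from the README
# (commented out so that this sketch has no side effects):
# require_tools perl taxonkit \
#   && bash bin/buildTaxonomy.sh \
#   && bash bin/filterTaxonomy.sh
```

Checking all tools up front, rather than stopping at the first missing one, gives a single complete error report instead of one failure per run.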
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
#!/bin/bash | ||
|
||
set -eu | ||
|
||
thisdir=$(dirname $0) | ||
KALAMARI_VER=$(downloadKalamari.pl --version) | ||
|
||
sharedir=$thisdir/../share/kalamari-$KALAMARI_VER | ||
SRC="$sharedir/kalamari" | ||
TAXDIR="$sharedir/taxonomy/filtered" | ||
|
||
# Test prereqs | ||
which kraken-build | ||
which jellyfish | ||
|
||
DB="$sharedir/kalamari-kraken1" | ||
mkdir -pv $DB | ||
cp -rv $TAXDIR $DB/taxonomy | ||
find $SRC -name '*.fasta' \ | ||
-exec kraken-build --db $DB --add-to-library {} \; | ||
kraken-build --db $DB --build --threads 1 | ||
kraken-build --db $DB --clean |
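
The script above adds each FASTA file to the library with `find ... -exec ... {} \;`, which runs one command per matching file. The sketch below illustrates that behavior with `echo` standing in for `kraken-build`, so nothing is actually built. The function name `demo_add_to_library` is hypothetical and is used only for illustration.

```shell
#!/bin/bash
# Hypothetical demo (not part of Kalamari): print the command that would
# run for each *.fasta file under the source directory. `echo` is a
# stand-in for the real kraken-build invocation.
demo_add_to_library() {
  local src="$1"
  local db="$2"
  # `\;` ends the -exec clause, so the command runs once per file;
  # `{}` is replaced by the path of the current file.
  find "$src" -name '*.fasta' \
    -exec echo kraken-build --db "$db" --add-to-library {} \;
}

# Example (the paths are placeholders):
# demo_add_to_library share/kalamari-VER/kalamari share/kalamari-VER/kalamari-kraken1
```

Quoting `"$src"` and `"$db"` keeps the command intact if either path contains spaces, which the unquoted `$SRC` and `$DB` in the script above would not.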
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
#!/bin/bash | ||
|
||
set -eu | ||
|
||
thisdir=$(dirname $0) | ||
KALAMARI_VER=$(downloadKalamari.pl --version) | ||
|
||
sharedir=$thisdir/../share/kalamari-$KALAMARI_VER | ||
SRC="$sharedir/kalamari" | ||
TAXDIR="$sharedir/taxonomy/filtered" | ||
|
||
# Test prereqs | ||
which kraken2-build | ||
which jellyfish | ||
|
||
DB="$sharedir/kalamari-kraken2" | ||
mkdir -pv $DB | ||
cp -rv $TAXDIR $DB/taxonomy | ||
find $SRC -name '*.fasta' \ | ||
-exec kraken2-build --db $DB --add-to-library {} \; | ||
kraken2-build --db $DB --build --threads 1 | ||
kraken2-build --db $DB --clean |
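
Both build scripts above derive the same directory layout from the script directory and the Kalamari version, and differ only in the database suffix (`kraken1` or `kraken2`). A minimal sketch of that derivation follows. The function name `kalamari_layout` and the version string in the example are hypothetical; the path components are taken from the two scripts.

```shell
#!/bin/bash
# Hypothetical helper (not part of Kalamari): print the directories that
# the two build scripts derive, given the script directory, the Kalamari
# version, and the database flavor (kraken1 or kraken2).
kalamari_layout() {
  local bindir="$1"
  local version="$2"
  local flavor="$3"
  local sharedir="$bindir/../share/kalamari-$version"
  printf 'SRC=%s\n' "$sharedir/kalamari"
  printf 'TAXDIR=%s\n' "$sharedir/taxonomy/filtered"
  printf 'DB=%s\n' "$sharedir/kalamari-$flavor"
}

# Example with a placeholder version string:
# kalamari_layout bin 1.2.3 kraken2
```

Keeping the layout in one place would let a Kraken1 and a Kraken2 build share everything except the flavor argument.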