Kraken not available inside the docker image #179

Open
cvaske opened this issue Jan 7, 2020 · 2 comments

cvaske commented Jan 7, 2020

bcbio_vm.py install fails because the kraken executable is missing from the tooldir. I ran the following installation command:

bcbio_vm.py --datadir /bcbiotest install \
    --data --tools --cores 8 \
    --genomes hg19 \
    --aligners bwa \
    --datatarget variation --datatarget battenberg --datatarget kraken --datatarget gemini

And got the stack trace:

Upgrading bcbio-nextgen data files
List of genomes to get (from the config file at '{'genomes': [{'dbkey': 'hg19', 'name': 'Human (hg19)', 'indexes': ['seq', 'twobit'], 'annotations': ['GA4GH_problem_regions', 'capture_regions', 'MIG', 'prioritize', 'dbsnp', 'hapmap', '1000g_omni_snps', 'ACMG56_genes', '1000g_snps', 'mills_indels', 'clinvar', 'cosmic', 'ancestral', 'qsignature', 'genesplicer', 'effects_transcripts', 'varpon', 'vcfanno', 'viral', 'battenberg', 'esp', 'exac', 'gnomad_exome', '1000g', 'transcripts', 'RADAR', 'rmsk', 'fusion-blacklist', 'mirbase'], 'validation': ['giab-NA12878', 'platinum-genome-NA12878', 'giab-NA24385', 'giab-NA24631', 'giab-NA24143', 'giab-NA24149']}], 'genome_indexes': ['bwa', 'rtg'], 'install_liftover': False, 'install_uniref': False}'): Human (hg19)
Running GGD recipe: hg19 esp ESP6500SI-V2
Running GGD recipe: hg19 exac 0.3
Running GGD recipe: hg19 gnomad_exome 2.1.1
Running GGD recipe: hg19 1000g phase3_shapeit2_mvncall_integrated_v5a.20130502
Traceback (most recent call last):
  File "/usr/local/bin/bcbio_nextgen.py", line 228, in <module>
    install.upgrade_bcbio(kwargs["args"])
  File "/usr/local/share/bcbio-nextgen/anaconda/lib/python3.6/site-packages/bcbio/install.py", line 106, in upgrade_bcbio
    upgrade_bcbio_data(args, REMOTES)
  File "/usr/local/share/bcbio-nextgen/anaconda/lib/python3.6/site-packages/bcbio/install.py", line 354, in upgrade_bcbio_data
    _install_kraken_db(_get_data_dir(), args)
  File "/usr/local/share/bcbio-nextgen/anaconda/lib/python3.6/site-packages/bcbio/install.py", line 616, in _install_kraken_db
    os.path.join(tooldir, "bin", "kraken"))
argparse.ArgumentTypeError: kraken not installed in tooldir /usr/local/bin/kraken.
' returned non-zero exit status 1.
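For readers following the traceback, the failing step looks like a pre-flight check that the tool exists under `<tooldir>/bin` before its database is installed. The snippet below is only a minimal sketch of that kind of check, not the actual bcbio source: the function name `require_tool` is hypothetical, and the message format is modelled on the error shown above.

```python
import argparse
import os


def require_tool(tooldir, name):
    """Return the path to <tooldir>/bin/<name>, or raise if it is missing.

    Hypothetical helper for illustration; bcbio's own check may differ.
    """
    path = os.path.join(tooldir, "bin", name)
    # Treat the tool as installed only if it is a regular, executable file.
    if not (os.path.isfile(path) and os.access(path, os.X_OK)):
        raise argparse.ArgumentTypeError(
            "%s not installed in tooldir %s." % (name, path))
    return path
```

With a tooldir of `/usr/local` and no `/usr/local/bin/kraken` present, `require_tool("/usr/local", "kraken")` would raise an error with the same wording as the last traceback line.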

It appears that there is indeed no kraken executable in the docker image:

charlie@box:~$ docker run -it quay.io/bcbio/bcbio-vc find / -name kraken
charlie@box:~$
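The same search can be run from Python, which may be handy where `find` is not convenient. This is a rough sketch that assumes a Python interpreter is available in the environment being checked; `find_by_name` is a hypothetical helper and only approximates `find <root> -name <name>` (it does not report the root directory itself).

```python
import os


def find_by_name(root, name):
    """Yield every path below root whose final component equals name."""
    # os.walk does not follow symlinked directories by default, like find.
    for dirpath, dirnames, filenames in os.walk(root):
        for entry in dirnames + filenames:
            if entry == name:
                yield os.path.join(dirpath, entry)
```

An empty result from `list(find_by_name("/", "kraken"))` would correspond to the empty output of the `find` command above.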

Is there a way to use kraken via the Docker container?

Member

chapmanb commented Jan 9, 2020

Charles;
Apologies about the issue. Some of the programs you're looking to use (Kraken, Battenberg and Gemini) are pretty data intensive and not great fits for Docker, so we haven't integrated those to work with it. Those would best be used with a standard bcbio install not using Docker. Apologies for not supporting this and thanks for trying it out.

Author

cvaske commented Jan 9, 2020

Got it, that seems reasonable. Is this due to slower file I/O in Docker, or just to keep the container image size from getting too large?

Would you be interested in a pull request to update the documentation?
