remote data on S3 #172

matthdsm · 2019-01-29T12:13:48Z

Hi Brad,

Quick question. The commit history shows "improved support for data on AWS". Could you elaborate a bit on this?

We're looking into decentralizing all of our data to (self-hosted) S3 repos powered by minio and CephFS RADOS gateway.

This means all fastq data and all reference data (e.g. the complete genomes dir) are hosted at an S3 URL. What's the best way to configure bcbio to leverage this?
How do we configure S3 fastq input and S3-hosted reference data (if possible)?

Thanks for the help
Cheers
M


Documents work in progress for AWS Batch support with Cromwell. It's not yet working pending improvements to Cromwell, but documents setup and current status. bcbio/bcbio-nextgen-vm#172

chapmanb · 2019-01-30T13:02:04Z

Matthias;
Thanks for looking into this. This is still a work in progress, but we're working on supporting CWL runs on AWS Batch using Cromwell. It's not yet functional, but here is the work-in-progress documentation so you can see what we've got in place:

https://bcbio-nextgen.readthedocs.io/en/latest/contents/cloud.html#amazon-web-services-aws-batch

Practically, it sounds like you don't need AWS Batch and would instead just want to build inputs from S3-like buckets and then run them on your own infrastructure. This should work with the current CWL and Cromwell. You'd create an s3: configuration block in your input bcbio_system.yaml as described in the docs, and it should then stage files down from there for running on your local cluster and shared filesystem.
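
As a rough illustration of the s3: block mentioned above, here is a minimal Python sketch that validates a set of S3 URLs and renders them as a YAML fragment. The key names `ref` and `inputs`, the bucket names, and both helper functions are assumptions made for illustration, not something confirmed in this thread; check the linked documentation for the exact layout bcbio expects. The sketch also does not cover pointing bcbio at a non-AWS endpoint such as a self-hosted gateway.

```python
from urllib.parse import urlparse


def split_s3_url(url):
    """Split 's3://bucket/some/key' into (bucket, key); raise on anything else."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URL: {url!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def render_s3_block(ref_url, input_urls):
    """Render an 's3:' YAML block as text.

    ASSUMPTION: the 'ref' and 'inputs' key names are hypothetical here;
    verify them against the bcbio documentation before use.
    """
    # Validate every URL up front so a typo fails before a run is launched.
    for url in [ref_url, *input_urls]:
        split_s3_url(url)
    lines = ["s3:", f"  ref: {ref_url}", "  inputs:"]
    lines += [f"    - {url}" for url in input_urls]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical bucket names, used only to show the output shape.
    print(render_s3_block(
        "s3://example-reference/genomes",
        ["s3://example-fastq/run1"],
    ))
```

The rendered text could be pasted into bcbio_system.yaml alongside the existing resources section, once the key names have been checked against the docs.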

I'd definitely welcome feedback and reports if you test this out. Thanks again.
