Crackling is one of the leading CRISPR-Cas9 guide RNA design tools.
In this implementation of Crackling, we use serverless technologies from Amazon Web Services (AWS) so anyone can design high-quality gRNAs without having to send their data to a third party.
With thanks to our colleagues at the CSIRO for their support during the development of this edition of the pipeline.
For support, contact Jake Bradford.
The International Conference for High Performance Computing, Networking, Storage, and Analysis (Supercomputing) 2024
... in the Workshop: "WHPC: Diversity and Inclusion for All" (abstract)
Event-driven high-performance cloud computing for CRISPR-Cas9 guide RNA design
Divya Joy¹, Jacob Bradford¹
¹ Queensland University of Technology, Brisbane, Australia
The Annual Conference of the Australian Bioinformatics and Computational Biology Society 2020
CRISPR, faster, better - The Crackling method for whole-genome target detection
Jacob Bradford¹, Timothy Chappell¹, Brendan Hosking², Laurence Wilson², Dimitri Perrin¹
¹ Queensland University of Technology, Brisbane, Australia
² Commonwealth Scientific and Industrial Research Organisation (CSIRO), Sydney, Australia
Please cite our paper when using Crackling:
Bradford, J., Chappell, T., & Perrin, D. (2022). Rapid whole-genome identification of high quality CRISPR guide RNAs with the Crackling method. The CRISPR Journal, 5(3), 410-421.
The standalone implementation is available on GitHub, here.
- If you do not have an AWS account, follow this AWS user guide.
- Clone this repository, or download a Zip copy from GitHub.
- Follow the deployment instructions below.
- Access your deployment of Crackling Cloud using the generated URLs:
  - The CloudFront URL provides access to a simple web interface to submit jobs and retrieve results.
  - The API endpoint URL provides access to the same features as the web interface, but allows you to write custom scripts or use third-party tools to interface with your deployment of Crackling Cloud (see the example script below).
For example:
CloudfrontURL: d123q1z2zzz999.cloudfront.net
CracklingRestApiEndpoint: https://e123456789.execute-api.ap-southeast-2.amazonaws.com/prod/
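If you prefer to script against the API rather than use the web interface, the sketch below shows the general idea using Python's `requests` library. The request path and JSON field names are assumptions made for illustration only; consult `aws/README.md` (or the requests made by the web interface) for the actual schema of your deployment.

```python
"""Hypothetical sketch: submit a job to a Crackling Cloud deployment.

The request path and field names below are assumptions, not the documented
schema; replace them with the real ones for your deployment.
"""
import requests

# Your CracklingRestApiEndpoint, as printed after deployment
API_ENDPOINT = "https://e123456789.execute-api.ap-southeast-2.amazonaws.com/prod/"

payload = {
    "sequence": "ATCGATCGATCGATCGATCGAGGATCGATCGATCGATCGATCGTGGCCAATCGATCGATCGATCGATCG",
    "genome": "GCA_000482205.1",  # NCBI genome accession
}

response = requests.post(API_ENDPOINT, json=payload, timeout=30)
response.raise_for_status()
print(response.json())
```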
- Access the web interface using the CloudFront URL.
- Submit a job with these details (provided as defaults):
  - Query sequence: ATCGATCGATCGATCGATCGAGGATCGATCGATCGATCGATCGTGGCCAATCGATCGATCGATCGATCG
  - Genome Accession: GCA_000482205.1
- After submitting the job, the interface will automatically switch to the 'retrieve results' tab. Click the green 'Retrieve Results' button periodically until all results are ready. The status indicator will show how the analysis is progressing, for example:
  Identified 3 candidate guides
  Completed efficiency evaluation for 0 guides
  Completed specificity evaluation for 0 guides
The sample inputs will generate three guide RNAs.
- Start, end and strand describe where each guide RNA is found along the input gene sequence.
- The sequence is the guide RNA itself.
- The consensus result reflects the predicted efficiency of the guide RNA. See the 'About' tab for more information. You should use guides that score at least two out of three.
- The off-target score reflects the predicted specificity of the guide RNA. See the 'About' tab for more information. You should use guides that score at least 75 out of 100. A sketch for filtering results by these two thresholds follows this list.
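If you download your results for further processing, the snippet below is a minimal sketch of applying the two thresholds above. It assumes a CSV export with columns named `sequence`, `consensus` and `offtarget_score`; these names are placeholders, so adjust them to match the headers in your actual results file.

```python
"""Minimal sketch: keep only guides passing the recommended thresholds.

Column names ('sequence', 'consensus', 'offtarget_score') are assumptions;
rename them to match your downloaded results file.
"""
import csv

with open("results.csv", newline="") as handle:
    rows = list(csv.DictReader(handle))

# Consensus of at least 2/3 and off-target score of at least 75/100
accepted = [
    row for row in rows
    if int(row["consensus"]) >= 2 and float(row["offtarget_score"]) >= 75
]

for row in accepted:
    print(row["sequence"], row["consensus"], row["offtarget_score"])
```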
Coming soon: a short-cut method for deploying this infrastructure to your cloud account!
For now, read the Development instructions section.
Be sure you have cloned this repository to your computer.
1. Install the AWS command-line interface
Follow the AWS Documentation for Getting started with the AWS CLI
2. Install the AWS Cloud Development Kit
Follow the AWS Documentation for Getting started with the AWS CDK
3. Shared objects (for binaries)
Collect all shared objects needed by compiled binaries.
Working in the root directory of the repo, run:
ldd layers/isslScoreOfftargets/isslScoreOfftargets | grep "=> /" | awk '{print $3}' | xargs -I '{}' cp -v '{}' layers/sharedObjects
then
ldd layers/rnaFold/rnaFold/RNAfold | grep "=> /" | awk '{print $3}' | xargs -I '{}' cp -v '{}' layers/sharedObjects
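If the one-liners above are awkward in your environment, the same collection step can be scripted. The sketch below is an equivalent Python version; it simply re-implements the `ldd | grep | awk | xargs cp` pipeline for both binaries and should be run from the root of the repository.

```python
"""Collect the shared objects needed by the compiled binaries into
layers/sharedObjects. Equivalent to the two ldd one-liners above."""
import shutil
import subprocess
from pathlib import Path

BINARIES = [
    "layers/isslScoreOfftargets/isslScoreOfftargets",
    "layers/rnaFold/rnaFold/RNAfold",
]
DEST = Path("layers/sharedObjects")
DEST.mkdir(parents=True, exist_ok=True)

for binary in BINARIES:
    # ldd lines look like: "libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x...)"
    ldd_output = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=True
    ).stdout
    for line in ldd_output.splitlines():
        if "=> /" in line:
            so_path = line.split("=>")[1].split()[0]
            print(f"{so_path} -> {DEST}")
            shutil.copy(so_path, DEST)
```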
**4. Python Modules**
The `pip install -r` command is used frequently throughout the following section. In some environments, this command errors out. If this occurs, please view the requirements.txt file (referenced in the command) and use pip to install each listed library manually (e.g. `python3 -m pip install <package>` for each entry).
PartLoader Layer
Working in the root directory of the repository, run:
mkdir -p ./layers/requestsPy310Pkgs/python
python3 -m pip install --target layers/requestsPy310Pkgs/python requests
Consensus Layer
The consensus layer has Python dependencies, including scikit-learn. Scikit-learn and its dependencies exceed 250 MB. To overcome the 250 MB Lambda layer limit, these dependencies are installed directly within the consensus Lambda and uploaded to S3, so they do not need to be installed locally before deployment.
If you make changes to the dependencies, make sure the requirements.txt file is updated:
cd modules/consensus
pip freeze > requirements.txt
NCBI Layer
Working in the root directory of the repo, run:
mkdir -p layers/ncbi/python
python3 -m pip install --target layers/ncbi/python -r layers/ncbi_reqs.txt
AWS App Modules
Working in the <root>/aws directory:
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
deactivate
5. Further Reading
Please now read the following documentation for further installation instructions and a deeper understanding of the application:
<root>/layers/README.md
<root>/modules/README.md
<root>/aws/README.md
6. Deploying using the CDK
Working from the <root>/aws directory:
# Run this during the first deployment, substituting your own AWS account ID and region
cdk bootstrap aws://377188290550/ap-southeast-2
# Useful CDK commands include:
cdk synth # for creating the CloudFormation template without deploying
cdk deploy # for deploying the stack via CloudFormation
cdk destroy # for destroying the stack in CloudFormation
# add the `--profile` flag to indicate which set of AWS credentials you wish to use, e.g. `--profile bmds`.