Variant calling Snakemake pipeline for analyzing yEvo sequencing data.
- Make sure you have conda installed.
- Install Mamba to facilitate Snakemake installation, as recommended in the Snakemake docs:
$ conda install -n base -c conda-forge mamba
- Clone this repo:
$ git clone https://github.com/dunhamlab/yevo_pipeline.git
- Create the provided environment using Mamba:
$ cd yevo_pipeline/ && mamba env create -f environment.yml
- Activate the new conda environment:
$ conda activate yevo_pipeline_env
- Download the required pipeline inputs and test sequencing data:
$ ./scripts/download_test_data.sh
- Generate the run_pipeline.sh script using the included utility script:
$ ./scripts/gen_run_script.sh
You're ready to run the pipeline!
After following the above installation instructions, run the pipeline on the provided test input files:
$ ./run_pipeline.sh
NOTE: Be sure that you are in the repo's base directory with the yevo_pipeline_env conda environment activated.
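The two conditions in the note can be checked before launching. The following is a minimal sketch, not part of the repo: the function names are hypothetical, and it assumes the repo base can be recognized by the presence of run_pipeline.sh and config/config.yml. It relies on CONDA_DEFAULT_ENV, which conda sets to the name of the active environment.

```shell
# Hypothetical pre-flight helpers; these are NOT shipped with yevo_pipeline.

# Succeeds if the named conda environment is the active one.
# conda exports CONDA_DEFAULT_ENV when an environment is activated.
env_is_active() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

# Succeeds if the given directory looks like the repo base, i.e. it
# contains the two files this README refers to (an assumption).
in_repo_base() {
    [ -f "$1/run_pipeline.sh" ] && [ -f "$1/config/config.yml" ]
}

# Runs both checks against the current directory and reports failures.
preflight() {
    if ! env_is_active "yevo_pipeline_env"; then
        echo "Activate the environment first: conda activate yevo_pipeline_env" >&2
        return 1
    fi
    if ! in_repo_base "$(pwd)"; then
        echo "Run this from the repo's base directory" >&2
        return 1
    fi
    return 0
}
```

One possible use is to gate the run on the check: preflight && ./run_pipeline.sh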
To run this pipeline on your own sequencing data, configure runs using run_pipeline.sh:
- FASTQ_DIR is the absolute path to the raw data (e.g. fastq.gz) directory
- OUTPUT_DIR is the absolute path to your desired output directory, which Snakemake will create
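Because both settings must be absolute paths, it can help to validate them before a run. This is a minimal sketch under stated assumptions: the helper names are hypothetical and not part of the repo, and how run_pipeline.sh actually consumes FASTQ_DIR and OUTPUT_DIR is not shown here. OUTPUT_DIR is only checked for being absolute, since Snakemake creates it.

```shell
# Hypothetical validation helpers; these are NOT shipped with yevo_pipeline.

# Succeeds only if the argument is an absolute path (starts with "/").
is_absolute_path() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# $1 = FASTQ_DIR: must be absolute and must already exist.
# $2 = OUTPUT_DIR: must be absolute; it does not need to exist yet.
check_run_paths() {
    if ! is_absolute_path "$1"; then
        echo "FASTQ_DIR is not an absolute path: $1" >&2
        return 1
    fi
    if [ ! -d "$1" ]; then
        echo "FASTQ_DIR does not exist: $1" >&2
        return 1
    fi
    if ! is_absolute_path "$2"; then
        echo "OUTPUT_DIR is not an absolute path: $2" >&2
        return 1
    fi
    return 0
}
```

For example, check_run_paths "$FASTQ_DIR" "$OUTPUT_DIR" returns non-zero and prints the reason if either value would not satisfy the requirements above.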
Reference genome, ancestor, and annotation file paths are located in the config/config.yml file and can also be modified as needed.