Nextflow-Metasociomic: analysing bacterial social interactions in metagenomes of longitudinally collected human microbiome samples.
The project aims to identify and analyze siderophore operons in metagenomic data, using the data from Andersen et al. 2015. For this purpose, an automated pipeline with customizable parameters will be built, covering quality analysis, mapping, and variant calling on bacterial metagenomes. The pipeline takes FASTQ files as input, applies hard filtering, and outputs HTML reports, BAM files, and gVCFs for joint genotyping.
The following pipeline was constructed using Nextflow and Docker to better track the steps and provide an easy way to generate a final document.
The pipeline requires Nextflow and Docker on the target system. These are often pre-installed on HPC systems.
Nextflow: if you do not have Nextflow installed, use the
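A quick way to confirm both requirements before going further is to check that the two executables are on your PATH. This is a minimal sketch, not part of the repository; the function name check_tool is made up for illustration.

```shell
#!/usr/bin/env bash
# Report whether a required tool is available on PATH.
# Returns 0 if the tool is found, 1 otherwise.
# check_tool is a hypothetical helper, not shipped with the pipeline.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Check the two tools the pipeline depends on and count the missing ones.
missing=0
for tool in nextflow docker; do
    check_tool "$tool" || missing=$((missing + 1))
done
echo "missing tools: $missing"
```

If the last line reports anything other than 0, install the missing tool before cloning and running the workflow.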
Docker:
The recommended way is to clone the repository from GitHub:
git clone https://github.com/jimmlucas/wgs-metasociomic.git
cd wgs-metasociomic
It is also recommended that you pre-pull the Docker image required by the workflow; there is a "Dockerfile" in the directory. Remember to run the following command from the repository root:
docker pull ghcr.io/jimmlucas/wgs:short_reads
or run the pipeline with the following command:
nextflow run run.nf -with-docker jimmlucas/dvt:wgs
Download
Sometimes we forget how to download massive read sets from a database. To save you time, a script for automatic downloading is provided. However, if you prefer to use your own data or you already have the reads downloaded, ignore this step and proceed to the next one.
Run the script from the repository root using:
bash ./workflow/bin/download_reads.sh
Remember that to use automatic downloading, you need to have the accession list already prepared and provide the full path to the file:
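Before launching the download, the accession list can be sanity-checked. The sketch below is not part of the repository. It assumes the list is a plain-text file with one accession per line, with blank lines and lines starting with # ignored; that format and the function name validate_accession_list are assumptions for illustration. How download_reads.sh itself receives the path is not covered here.

```shell
#!/usr/bin/env bash
# Validate an accession list before starting a download.
# Assumed format (not confirmed by the pipeline): one accession per line;
# blank lines and lines starting with '#' are ignored.
# validate_accession_list is a hypothetical helper.
validate_accession_list() {
    local path="$1"

    # The README asks for the full path, so reject relative paths.
    case "$path" in
        /*) ;;
        *) echo "error: not an absolute path: $path" >&2; return 1 ;;
    esac

    # -s is true only if the file exists and is not empty.
    if [ ! -s "$path" ]; then
        echo "error: file missing or empty: $path" >&2
        return 1
    fi

    # Count lines that are neither blank nor comments.
    # grep exits non-zero when the count is 0, hence the '|| true'.
    local n
    n=$(grep -cv -e '^[[:space:]]*$' -e '^#' "$path" || true)
    echo "accessions: $n"
}
```

Call the function with the full path to your own list; a non-zero exit status means the file should be fixed before the download script is run.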
S. Andersen, J. Schluter, "A metagenomics approach to investigate microbiome sociobiology", 2021
S. Bush, "Read trimming has minimal effect on bacterial SNP-calling accuracy"