RNAseq pipeline

an RNAseq quantification pipeline that goes from .fastq files to a gene counts matrix, using kallisto for coding and non-coding RNA.

set up pipeline

before running, you have to build the Docker image shipped with the repository (takes ~30 min):

docker build -t rnaseq-pipeline https://raw.githubusercontent.com/loipf/RNAseq-pipeline/master/docker/Dockerfile

now either replace the Docker container hash (last output line from previous build command) in nextflow.config or run nextflow with the -with-docker rnaseq-pipeline argument.

run quantification pipeline

the pipeline performs no pre-processing or quality improvement of the reads; this must be done by the user! after a run, check the file kallisto_aligned_reads_qc.csv: p_pseudoaligned should be >70% and DESeq2_size_factor around 0.6-1.4. transcript counts are summed up to gene_symbols (haplotype and scaffold genes are excluded).
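the QC check described above could be scripted roughly as follows. this is only a sketch: the thresholds and the column names p_pseudoaligned and DESeq2_size_factor come from the text, while the sample_id column name, the percent scale (0-100) of p_pseudoaligned and the example values are assumptions that should be checked against the actual file.

```python
import csv
import io

# thresholds taken from the README text
MIN_PSEUDOALIGNED = 70.0          # assumed to be given in percent (0-100)
SIZE_FACTOR_RANGE = (0.6, 1.4)


def flag_problematic_samples(csv_file, id_column="sample_id"):
    """Return {sample: [reasons]} for samples outside the suggested QC ranges.

    `id_column` is a hypothetical column name; adapt it to the real header.
    """
    flagged = {}
    for row in csv.DictReader(csv_file):
        reasons = []
        p = float(row["p_pseudoaligned"])
        sf = float(row["DESeq2_size_factor"])
        if p <= MIN_PSEUDOALIGNED:
            reasons.append(f"p_pseudoaligned={p}")
        if not SIZE_FACTOR_RANGE[0] <= sf <= SIZE_FACTOR_RANGE[1]:
            reasons.append(f"DESeq2_size_factor={sf}")
        if reasons:
            flagged[row[id_column]] = reasons
    return flagged


# made-up example data, only to show the expected layout
example = io.StringIO(
    "sample_id,p_pseudoaligned,DESeq2_size_factor\n"
    "s1,85.2,1.05\n"
    "s2,61.0,0.95\n"
    "s3,90.1,1.80\n"
)
print(flag_problematic_samples(example))
```

for a real run, pass an opened file instead of the in-memory example, e.g. open("kallisto_aligned_reads_qc.csv", newline="") with the path adjusted to wherever the pipeline wrote the file.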

the pipeline can be run locally from a downloaded copy of the GitHub repo, after editing the nextflow.config file, with:

nextflow run main.nf

or

nextflow run loipf/RNAseq-pipeline -r main --project_dir /path/to/folder --reads_dir /path/to/samples --ensembl_release 101 --num_threads 10 -with-docker rnaseq-pipeline

for this to work properly, the command has to be run from within the project directory.

optional nextflow arguments:

-resume
-with-report report_RNAseq-pipeline
-with-timeline timeline_RNAseq-pipeline
-w work_dir

optional pipeline arguments:

--num_threads 5
--ensembl_release 101
--include_ncrna true   # or false
--nextflow_stageInMode symlink  # or copy
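when many runs with different settings are needed, the command line can be assembled programmatically. the following is a minimal sketch that only uses the arguments listed above; the helper name build_nextflow_command is hypothetical and not part of the pipeline, and the command is only printed, not executed.

```python
import shlex


def build_nextflow_command(pipeline_args, nextflow_args=()):
    """Assemble a `nextflow run` command line as a list of strings."""
    cmd = ["nextflow", "run", "loipf/RNAseq-pipeline", "-r", "main"]
    for name, value in pipeline_args.items():
        cmd += [f"--{name}", str(value)]   # pipeline parameters use a double dash
    cmd += list(nextflow_args)             # nextflow options use a single dash
    return cmd


cmd = build_nextflow_command(
    {
        "project_dir": "/path/to/folder",
        "reads_dir": "/path/to/samples",
        "ensembl_release": 101,
        "num_threads": 5,
        "include_ncrna": "true",
    },
    nextflow_args=["-with-docker", "rnaseq-pipeline", "-resume"],
)
print(shlex.join(cmd))
```

the resulting list can be passed to subprocess.run(cmd, check=True) from within the project directory once the printed command looks right.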

by default, all output is saved into the data folder of the current directory. it is best to run in a fresh, empty folder structure, as not all new results overwrite old ones.

check quality reports in data/quality_reports to exclude problematic samples.

additionally, a 3' and a 5' adapter sequence (or sequence file) need to be specified with the nextflow arguments --adapter_3_seq_file [sequence|file.fasta] and --adapter_5_seq_file [sequence|file.fasta] or in the main.nf file. otherwise, two empty files named NO_FILE and NO_FILE2 must be created for the pipeline to run (needs to be fixed someday). if a file is provided, it must be structured like the following example:

> adapter_3_batch_01
AANTGG
> adapter_3_batch_02
GATCGG
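both cases (adapter file or empty placeholders) can be prepared with a few lines of code. this is a rough sketch: the function names are hypothetical, the output format simply mirrors the example above, and the directory in which the pipeline expects NO_FILE and NO_FILE2 is an assumption left to the caller.

```python
import tempfile
from pathlib import Path


def write_adapter_fasta(path, adapters):
    """Write adapters as header/sequence line pairs, as in the example above."""
    lines = []
    for name, sequence in adapters.items():
        lines.append(f"> {name}")
        lines.append(sequence)
    Path(path).write_text("\n".join(lines) + "\n")


def create_placeholder_files(directory="."):
    """Create the empty NO_FILE and NO_FILE2 files used when no adapters are given."""
    for name in ("NO_FILE", "NO_FILE2"):
        (Path(directory) / name).touch()


# demonstration in a temporary directory so that nothing is overwritten
with tempfile.TemporaryDirectory() as tmp:
    fasta = Path(tmp) / "adapter_3.fasta"
    write_adapter_fasta(
        fasta,
        {"adapter_3_batch_01": "AANTGG", "adapter_3_batch_02": "GATCGG"},
    )
    create_placeholder_files(tmp)
    print(fasta.read_text())
```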
