From 7396d4ccfd26d573269f32a1592ac20137ccf833 Mon Sep 17 00:00:00 2001
From: Alfred Kedhammar
Date: Thu, 5 Dec 2024 15:11:32 +0000
Subject: [PATCH] update docs

---
 docs/output.md | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/docs/output.md b/docs/output.md
index a2706a8..dc26159 100644
--- a/docs/output.md
+++ b/docs/output.md
@@ -56,7 +56,16 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
 
 [Fastqscreen](https://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/) allows you to set up a standard set of libraries against which all of your sequences can be searched. Your search libraries might contain the genomes of all of the organisms you work on, along with PhiX, Vectors or other contaminants commonly seen in sequencing experiments.
 
-It requires the supply of referenced (databases) in a config file. In order to parallelize the mapping of the different samples, in seqinspector, this a fastqscreen config file is generated for every sample/reference combination.
+It requires a `.csv` detailing:
+
+- the working name of the reference
+- the name of the aligner used to generate its index (which is also the aligner and index used by the tool)
+- the file basename of the reference and its index (e.g. the reference `genoma.fa` and its index `genome.bt2` have the basename `genome`)
+- the path to a dir where the reference and index files both reside.
+
+See `assets/example_fastq_screen_references.csv` for example.
+
+The `.csv` is provided as a pipeline parameter `fastq_screen_references`. The `.csv` is used to construct a `FastQ Screen` configuration file within the context of the process work directory in order to properly mount the references.
 
 ### SeqFu Stats
 
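
Reviewer note: the added docs describe turning the references `.csv` into a `FastQ Screen` configuration file, which could be illustrated roughly as follows. This is a minimal sketch, not the pipeline's actual implementation. The column names (`name`, `dir`, `basename`, `aligner`), the helper `build_fastq_screen_conf` and the example paths are hypothetical, since the patch does not show the CSV header. It also assumes FastQ Screen's tab-separated `DATABASE <name> <index prefix> <ALIGNER>` config line, where the index prefix is the directory joined with the basename.

```python
import csv
import io


def build_fastq_screen_conf(csv_text: str) -> str:
    """Turn a references CSV into FastQ Screen `DATABASE` config lines.

    Column names are assumptions made for this sketch; check them against
    `assets/example_fastq_screen_references.csv`.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The index prefix is the directory plus the shared file basename.
        index_prefix = f"{row['dir'].rstrip('/')}/{row['basename']}"
        # One tab-separated DATABASE entry per reference.
        fields = ["DATABASE", row["name"], index_prefix, row["aligner"].upper()]
        lines.append("\t".join(fields))
    return "\n".join(lines) + "\n"


# Hypothetical one-row CSV, for illustration only.
example_csv = """name,dir,basename,aligner
phix,/refs/phix,genome,bowtie2
"""

print(build_fastq_screen_conf(example_csv), end="")
```

Joining the directory and basename into a single prefix reflects the docs' requirement that the reference and its index reside in the same directory and share a basename.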