feat: add optional column for adapters (#419)

* add optional adapters column
* add missing .markdownlint.yaml
* update test config
* Update docs/user-guide/configuration.md
  Co-authored-by: Alexander Thomas <[email protected]>
* Update docs/user-guide/configuration.md
  Co-authored-by: Alexander Thomas <[email protected]>
* Update docs/user-guide/configuration.md
  Co-authored-by: Alexander Thomas <[email protected]>
* Update docs/user-guide/configuration.md
  Co-authored-by: Alexander Thomas <[email protected]>
* Update docs/user-guide/configuration.md
  Co-authored-by: Alexander Thomas <[email protected]>
* Update docs/user-guide/configuration.md
  Co-authored-by: Alexander Thomas <[email protected]>
* Update docs/user-guide/configuration.md
  Co-authored-by: Alexander Thomas <[email protected]>
* Update docs/user-guide/configuration.md
  Co-authored-by: Alexander Thomas <[email protected]>
* run pre-commit
  Co-authored-by: Alexander Thomas <[email protected]>
IKIM-Essen · Jan 3, 2022 · f883ce2
1 parent dab5879

Showing 2 changed files with 24 additions and 32 deletions.
diff --git a/docs/user-guide/configuration.md b/docs/user-guide/configuration.md
@@ -7,13 +7,13 @@ from the raw data.
 
 ### Config File
 
-The adapter sequences used can be entered in the config file under
+The adapter sequences used can be specified in the config file under
 `preprocessing` -> `kit adapters`.
 
 For **paired-end data**, the adapters can be detected by per-read overlap
-analysis, which seeks the overlap of each pair of reads. The adapter sequences
-can specify the adapter sequences for read one by `—adapter_sequence` and for
-read two by`—adapter_sequence_r2`. An example is for [Illuminas TruSeq library] (<https://www.illumina.com/products/by-type/sequencing-kits/library-prep-kits/truseq-rna-v2.html>)
+analysis, which seeks the overlap for each pair of reads. The adapter sequences
+can be specified for read one by `—adapter_sequence` and for
+read two by`—adapter_sequence_r2`. An example for [Illuminas TruSeq library] (<https://www.illumina.com/products/by-type/sequencing-kits/library-prep-kits/truseq-rna-v2.html>)
 is shown below:
 
 ```yaml
@@ -31,7 +31,7 @@ Adapters for **single-end data** can be specified only using the
 ### Sample Sheet
 
 The second way to remove adapter sequences is to specify the adapter sequence
-directly sample in the sample sheet. The adapters must enter it in a column
+per sample in the sample sheet. The adapters must be entered in a column
 called `adapters`. For paired-end and single-end format, see above. Here is
 an exemplary samples sheet:
 
@@ -40,8 +40,8 @@ an exemplary samples sheet:
 | example-1   | PATH/TO/fq1 | PATH/TO/fq2 | 1970-01-01 | 1                | illumina   | --adapter_sequence=ACGT --adapter_sequence_r2=TGCA |
 | example-2   | PATH/TO/fq  |             | 1970-01-01 | 1                | ion        | --adapter_sequence=ACGT                            |
 
-If an adapter sequence is entered into the sample sheet for one sample, this
-adapter sequence is obviously used to trim the sequences of this sample. For
+If an adapter sequence is specified for a sample in the sample sheet, this
+adapter sequence is used to trim the sequences of only this sample. For
 empty entries, UnCoVar uses the adapter sequence from the config file.
 
 ### Pre-Defined Adapters
@@ -52,7 +52,7 @@ namely:
 1. [Revelo RNA-Seq library preparation kit](https://lifesciences.tecan.com/revelo-rna-seq-library-prep-kit?p=tab--5)
 1. [EasySeq RC-PCR SARS CoV-2 Whole Genome Sequencing kit](https://www.nimagen.com/shop/products/rc-cov096/easyseq-sars-cov-2-novel-coronavirus-whole-genome-sequencing-kit)
 
-The `adapters` column in the sample sheet is used to trim the adapters sequences
+The `adapters` column in the sample sheet is used to trim the adapter sequences
 of these kits. Revelo adapters are trimmed by specifying
 `revelo-rna-seq` in the column per sample, while the Nimagen adapters are
 removed by specifying `nimagen-easy-seq`. A short example:

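The fallback rule described in the documentation change above (a per-sample `adapters` entry takes precedence, and empty entries fall back to the config file) can be sketched as follows. This is a minimal illustration, not UnCoVar's actual implementation: the function name `resolve_adapters`, the dictionary-shaped sample row, and the example adapter strings are assumptions made for this sketch; only the `adapters` column and the `preprocessing` -> `kit-adapters` config key come from the diff.

```python
from typing import Any, Mapping


def resolve_adapters(sample_row: Mapping[str, Any], config: Mapping[str, Any]) -> str:
    """Return the adapter arguments to use for one sample (hypothetical helper).

    A non-empty string in the sample sheet's `adapters` column wins.
    Anything else (missing column, empty string, or a non-string such as
    the NaN a table reader may produce for empty cells) falls back to the
    config value under preprocessing -> kit-adapters.
    """
    value = sample_row.get("adapters")
    if isinstance(value, str) and value.strip():
        return value.strip()
    return config["preprocessing"]["kit-adapters"]


# Illustrative values only; the sequences are placeholders like those in the docs.
config = {"preprocessing": {"kit-adapters": "--adapter_sequence=GGGG"}}
paired = {"sample_name": "example-1",
          "adapters": "--adapter_sequence=ACGT --adapter_sequence_r2=TGCA"}
empty = {"sample_name": "example-2", "adapters": ""}

print(resolve_adapters(paired, config))
print(resolve_adapters(empty, config))
```

Checking the type with `isinstance` rather than relying on truthiness is a deliberate choice here, since an empty cell may arrive as a float NaN, which is truthy in Python.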
diff --git a/workflow/schemas/config.schema.yaml b/workflow/schemas/config.schema.yaml
@@ -39,30 +39,22 @@ properties:
             description: minimal length of acceptable reads for illumina reads
           min-PHRED:
             type: integer
-            description: average quality of acceptable reads for illumina reads
-      ont:
-        properties:
-          min-length-reads:
-            type: integer
-            description: minimal length of acceptable reads for  Oxfort Nanopore reads
-        min-PHRED:
-          type: integer
-          description: average quality of acceptable reads for Oxfort Nanopore reads
-      min-identity:
-        type: number
-        description: identity to virus reference genome of reconstructed sequence
-      max-n:
-        type: number
-        description: share N in the reconstructed sequence
-      min-depth-with-PCR-duplicates:
-        type: number
-        description: minimum local sequencing depth without filtering of PCR duplicates
-      min-depth-without-PCR-duplicates:
-        type: number
-        description: minimum local sequencing depth after filtering PCR duplicates
-      min-allele:
-        type: number
-        description: minimum informative allele frequency
+            description: average quality of acceptable reads for Oxfort Nanopore reads
+        min-identity:
+          type: number
+          description: identity to virus reference genome of reconstructed sequence
+        max-n:
+          type: number
+          description: share N in the reconstructed sequence
+        min-depth-with-PCR-duplicates:
+          type: number
+          description: minimum local sequencing depth without filtering of PCR duplicates
+        min-depth-without-PCR-duplicates:
+          type: number
+          description: minimum local sequencing depth after filtering PCR duplicates
+        min-allele:
+          type: number
+          description: minimum informative allele frequency
   preprocessing:
     properties:
       kit-adapters: