- General
- TRIM Arguments
- ALIGN Arguments
- PROCESS Arguments
- ANALYSE Arguments
- FASTQ Files
- Multiplexed FASTQ Files
- Custom Variant Count File
- Barcoded Library Design
- Trans Library Design
General
- --runDemo Run the DiMSum Demo (default:F)
- --projectName Project name and directory where results are to be saved (default:'DiMSum_Project')
- --experimentDesignPath Path to Experimental Design File (required if '--runDemo'=F)
- --outputPath Path to directory to use for output files (default:'./' i.e. current working directory)
- --retainIntermediateFiles Should intermediate files be retained? Intermediate files can be many gigabytes, but are required to rerun DiMSum starting at intermediate pipeline stages (default:F)
- --startStage (Re-)Start DiMSum at a specific pipeline stage (default:0)
- --stopStage Stop DiMSum at a specific pipeline stage (default:5)
- --numCores Number of available CPU cores. All pipeline stages make use of parallel computing to decrease runtime if multiple cores are available (default:1)
TRIM Arguments
- --cutadapt5First Sequence of 5' constant region to be trimmed from first (or only) read (optional). Alternatively, both 5' and 3' optional/required constant region sequences can be specified with this argument e.g. '--cutadapt5First'='ACGT;optional...GGCC;required'.
- --cutadapt5Second Sequence of 5' constant region to be trimmed from second read in pair (optional). Alternatively, both 5' and 3' optional/required constant region sequences can be specified with this argument e.g. '--cutadapt5Second'='ACGT;optional...GGCC;required'.
- --cutadapt3First Sequence of 3' constant region to be trimmed from first (or only) read (default: reverse complement of '--cutadapt5Second')
- --cutadapt3Second Sequence of 3' constant region to be trimmed from second read in pair (default: reverse complement of '--cutadapt5First')
- --cutadaptMinLength Discard reads shorter than this length after trimming (default:50)
- --cutadaptErrorRate Maximum allowed error rate for trimming constant regions (default:0.2)
- --cutadaptOverlap Minimum overlap between read and constant region for trimming (default:3)
- --cutadaptCut5First Remove fixed number of bases from start (5') of first (or only) read before constant region trimming (optional)
- --cutadaptCut5Second Remove fixed number of bases from start (5') of second read in pair before constant region trimming (optional)
- --cutadaptCut3First Remove fixed number of bases from end (3') of first (or only) read before constant region trimming (optional)
- --cutadaptCut3Second Remove fixed number of bases from end (3') of second read in pair before constant region trimming (optional)
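
The default for '--cutadapt3First' and the combined 'ACGT;optional...GGCC;required' form can be illustrated with a minimal Python sketch. The helper names `reverse_complement` and `parse_constant_region` are hypothetical and are not part of DiMSum; the sketch only assumes that '...' separates the 5' and 3' parts and ';' separates a sequence from its optional/required flag, as in the example above.

```python
# Illustrative helpers only; these are NOT DiMSum functions.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_constant_region(spec):
    """Split 'SEQ;flag...SEQ;flag' into a list of (sequence, flag) pairs.

    A bare sequence without a flag is returned with an empty flag.
    """
    parts = []
    for chunk in spec.split("..."):
        seq, _, flag = chunk.partition(";")
        parts.append((seq, flag))
    return parts


# Hypothetical value passed as '--cutadapt5Second'
cutadapt5_second = "AACG"
# Default for '--cutadapt3First' as described above
cutadapt3_first_default = reverse_complement(cutadapt5_second)
```

Under these assumptions, a plain sequence yields a single pair, and the combined form yields one pair for the 5' part and one for the 3' part.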
ALIGN Arguments
- --vsearchMinQual Minimum Phred base quality score required to retain read or read pair (default:30)
- --vsearchMaxQual Maximum Phred base quality score accepted when reading (and used when writing) FASTQ files; cannot be greater than 93 (default:41)
- --vsearchMaxee Maximum number of expected errors tolerated to retain read or read pair (default:0.5)
- --vsearchMinovlen Discard read pair if the alignment length is shorter than this (default:10)
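
To make the quality thresholds above concrete, the following Python sketch computes the expected number of errors of a read from its Phred scores (the sum of 10^(-Q/10) over all bases) and applies both filters. It assumes Phred+33 encoded quality strings and interprets '--vsearchMinQual' as a requirement on every base of a non-empty read; the function names are hypothetical and the code is not taken from DiMSum or VSEARCH.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality_string]


def expected_errors(scores):
    """Expected number of base-calling errors: sum of per-base error probabilities."""
    return sum(10 ** (-q / 10) for q in scores)


def passes_quality_filter(quality_string, min_qual=30, max_ee=0.5):
    """Return True if every base has Phred >= min_qual and expected errors <= max_ee.

    Assumes a non-empty quality string.
    """
    scores = phred_scores(quality_string)
    return min(scores) >= min_qual and expected_errors(scores) <= max_ee
```

For example, a read whose bases all have quality 'I' (Phred 40) passes with the default thresholds, whereas a read containing bases of quality '5' (Phred 20) is rejected unless `min_qual` is lowered.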
PROCESS Arguments
- --reverseComplement Reverse complement sequences before variant processing? (default:F)
- --wildtypeSequence Wild-type nucleotide sequence (A/C/G/T). Lower-case bases (a/c/g/t) indicate internal constant regions to be removed (required if '--runDemo'=F)
- --permittedSequences Nucleotide sequence of IUPAC ambiguity codes (A/C/G/T/R/Y/S/W/K/M/B/D/H/V/N) with length matching the number of mutated positions (i.e. upper-case letters) in '--wildtypeSequence' (default:N i.e. any substitution mutation allowed)
- --sequenceType Coding potential of sequence: either 'noncoding', 'coding' or 'auto'. If the specified wild-type nucleotide sequence ('--wildtypeSequence') has a valid translation without a premature STOP codon, it is assumed to be 'coding' (default:'auto')
- --mutagenesisType Whether mutagenesis was performed at the nucleotide or codon/amino acid level; either 'random' or 'codon' (default:'random')
- --indels Indel variants to be retained: either 'all', 'none' or a comma-separated list of sequence lengths (default:'none')
- --maxSubstitutions Maximum number of nucleotide or amino acid substitutions for non-coding or coding sequences respectively (default:2)
- --mixedSubstitutions For coding sequences, are nonsynonymous variants with silent/synonymous substitutions in other codons allowed? (default:F)
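
The relationship between '--wildtypeSequence' and '--permittedSequences' can be sketched in Python as follows. This is only an illustration under stated assumptions: lower-case bases are treated as constant regions that have already been removed from the variant, the wild-type base is always allowed at each position, and the IUPAC code at a position restricts which substituted bases are accepted. The names `mutated_positions` and `is_permitted` are hypothetical and not part of DiMSum.

```python
# Standard IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mutated_positions(wildtype):
    """Indices of upper-case (mutated) positions in the wild-type sequence."""
    return [i for i, base in enumerate(wildtype) if base.isupper()]


def is_permitted(variant, wildtype, permitted):
    """Check a variant (constant regions removed) against per-position IUPAC codes."""
    wt = "".join(base for base in wildtype if base.isupper())
    if len(variant) != len(wt) or len(permitted) != len(wt):
        raise ValueError("variant and permitted codes must match the number of mutated positions")
    # Assumption: a wild-type base is always allowed; substitutions must match the code
    return all(v == w or v in IUPAC[p] for v, w, p in zip(variant, wt, permitted))
```

With the hypothetical wild type 'acACGTgt' there are four mutated positions, so a permitted-sequence string such as 'NNNR' would allow any substitution at the first three positions and only A or G at the last.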
ANALYSE Arguments
- --fitnessMinInputCountAll Minimum input read count (in all replicates) required for a variant to be retained during fitness calculations (default:0). Alternatively, thresholds can be applied to variants with specific numbers of nucleotide substitutions as follows: 'edit_distance:threshold' e.g. '--fitnessMinInputCountAll'='1:100,2:10,3:10' (unspecified variants are discarded).
- --fitnessMinInputCountAny Minimum input read count (in any replicate) required for a variant to be retained during fitness calculations (default:0). Alternatively, thresholds can be applied to variants with specific numbers of nucleotide substitutions as follows: 'edit_distance:threshold' e.g. '--fitnessMinInputCountAny'='1:100,2:10,3:10' (unspecified variants are discarded).
- --fitnessMinOutputCountAll Minimum output read count (in all replicates) required for a variant to be retained during fitness calculations (default:0). Alternatively, thresholds can be applied to variants with specific numbers of nucleotide substitutions as follows: 'edit_distance:threshold' e.g. '--fitnessMinOutputCountAll'='1:100,2:10,3:10' (unspecified variants are discarded).
- --fitnessMinOutputCountAny Minimum output read count (in any replicate) required for a variant to be retained during fitness calculations (default:0). Alternatively, thresholds can be applied to variants with specific numbers of nucleotide substitutions as follows: 'edit_distance:threshold' e.g. '--fitnessMinOutputCountAny'='1:100,2:10,3:10' (unspecified variants are discarded).
- --fitnessNormalise Normalise fitness values to minimise inter-replicate differences (default:T)
- --fitnessErrorModel Fit fitness error model (default:T)
- --fitnessDropoutPseudocount Pseudocount added to output replicates with dropout i.e. variants present in input but absent from output (default:0)
- --retainedReplicates Comma-separated list of (integer) experiment replicates to retain or 'all' (default:'all')
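
The 'edit_distance:threshold' form and the difference between the "All" and "Any" count arguments can be sketched in Python. This is a minimal illustration rather than DiMSum's implementation: it assumes a variant passes when its count is greater than or equal to the threshold, and that variants whose edit distance is not listed are discarded, as stated above. The function names are hypothetical.

```python
def parse_count_threshold(spec):
    """Parse '10' into 10, or '1:100,2:10' into {1: 100, 2: 10}."""
    if ":" not in spec:
        return int(spec)
    thresholds = {}
    for item in spec.split(","):
        distance, _, threshold = item.partition(":")
        thresholds[int(distance)] = int(threshold)
    return thresholds


def keep_variant(counts, edit_distance, threshold, mode="all"):
    """Decide whether a variant is retained given its per-replicate read counts.

    mode='all' requires every replicate to reach the threshold;
    mode='any' requires at least one replicate to reach it.
    """
    if isinstance(threshold, dict):
        if edit_distance not in threshold:
            return False  # unspecified edit distances are discarded
        minimum = threshold[edit_distance]
    else:
        minimum = threshold
    check = all if mode == "all" else any
    return check(count >= minimum for count in counts)
```

For instance, with the specification '1:100,2:10,3:10', a single-substitution variant with replicate counts of 120 and 90 would fail the "all" criterion but pass the "any" criterion under these assumptions.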
FASTQ Files
- --fastqFileDir Path to directory containing input FASTQ files (required for WRAP)
- --fastqFileExtension FASTQ file extension (default:'.fastq')
- --gzipped Are FASTQ files gzipped? (default:T)
- --stranded Is the library design stranded? (default:T)
- --paired Is the library design paired-end? (default:T)
- --experimentDesignPairDuplicates Are multiple instances of FASTQ files in the Experimental Design File permitted? (default:F)
Multiplexed FASTQ Files
- --barcodeDesignPath Path to Barcode Design File (tab-separated plain text file with barcode design)
- --barcodeErrorRate Maximum allowed error rate for barcode to be matched (default:0.25)
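
As a rough illustration of what an error rate of 0.25 means for a barcode, the Python sketch below converts the rate into a maximum number of mismatches. It assumes the rate is interpreted as a fraction of the barcode length, rounded down (the convention used by cutadapt), and it counts substitutions only; whether DiMSum applies exactly this rule is an assumption, and the function names are hypothetical.

```python
import math


def max_barcode_errors(barcode, error_rate=0.25):
    """Maximum number of tolerated errors for a barcode of this length."""
    return math.floor(error_rate * len(barcode))


def barcode_matches(read_prefix, barcode, error_rate=0.25):
    """Compare a barcode to the start of a read, counting substitutions only."""
    if len(read_prefix) < len(barcode):
        return False
    mismatches = sum(a != b for a, b in zip(read_prefix, barcode))
    return mismatches <= max_barcode_errors(barcode, error_rate)
```

Under this interpretation, an 8-nucleotide barcode tolerates up to two mismatches at the default rate, while a 3-nucleotide barcode tolerates none.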
Custom Variant Count File
- --countPath Path to Variant Count File for analysis with STEAM only (tab-separated plain text file with sample counts for all variants)
Barcoded Library Design
- --barcodeIdentityPath Path to Variant Identity File (tab-separated plain text file mapping barcodes to variants)
- --synonymSequencePath Path to Synonym Sequences File (plain text file with one coding nucleotide sequence per line)
Trans Library Design
- --transLibrary Paired-end reads correspond to distinct molecules? (default:F)
- --transLibraryReverseComplement Reverse complement second read in pair (default:F)