Stranded in the Best Way
Pre-release
What's new in 0.19.0
New Features
Strand-aware alignment
- Each BAM file can be split into separate files of reads originating from the plus versus the minus strand of the RNA using --sep-strands.
- The option --f1r2-plus / --f1r2-minus controls whether paired-end reads whose mate 1 aligns in the forward orientation and mate 2 in the reverse orientation are considered to originate from the plus or the minus strand (for the Illumina library prep kits that our lab uses, they come from the minus strand, so --f1r2-minus is the default). Reads where mates 1 and 2 align in the reverse and forward orientations, respectively, are considered to originate from the other strand.
- For single-end reads, the behavior is the same as for read 1: in --f1r2-minus mode, single-end reads that align in the forward orientation are considered to have come from the minus strand.
- The option --minus-label controls the label appended to the minus strand of each reference (by default, it is the name of the reference followed by -minus).
- In strand-aware mode, seismic align also writes a FASTA file of all reference sequences (including their minus strands) whose BAM files received a sufficient number of reads (controlled by --min-reads) into the same directory as the BAM files and align report, with the same name as the original FASTA file.
- Currently, strand-aware alignment is only available through seismic align, not seismic wf. This limitation arises because separating strands actually generates new reference sequences (namely, the minus strands); if those sequences are missing from the FASTA given to the relate step, then any BAM files aligned to the minus strands cannot be processed. It is straightforward to run seismic align in strand-aware mode, then use the FASTA file it generates as input for seismic relate or seismic wf. However, switching the FASTA file automatically within seismic wf will require non-trivial re-engineering of how the pipeline works (or some other hacks).
Bug Fixes
- The mechanism to release files to the output directory now keeps a backup of any existing output files until it is sure that the new files have been written. This setup avoids potentially deleting existing output files but then failing to write the new files, causing data loss.
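The general idea of keeping a backup until the new files are in place can be sketched as follows. This is a minimal illustration of the pattern, not the actual release mechanism; the function name release_with_backup and the ".backup" suffix are hypothetical, and the sketch assumes both paths are directories on the same filesystem.

```python
import shutil
import tempfile
from pathlib import Path


def release_with_backup(new_dir: Path, out_dir: Path) -> None:
    """Move new_dir to out_dir, keeping existing output until the move succeeds."""
    backup = None
    if out_dir.exists():
        # Set the existing output aside instead of deleting it.
        backup = out_dir.with_name(out_dir.name + ".backup")
        if backup.exists():
            shutil.rmtree(backup)
        out_dir.rename(backup)
    try:
        shutil.move(str(new_dir), str(out_dir))
    except Exception:
        # The release failed: discard any partial output and restore the backup.
        if out_dir.exists():
            shutil.rmtree(out_dir, ignore_errors=True)
        if backup is not None:
            backup.rename(out_dir)
        raise
    # Only now is it safe to delete the old output.
    if backup is not None:
        shutil.rmtree(backup)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out, new = Path(tmp, "out"), Path(tmp, "new")
        out.mkdir()
        new.mkdir()
        Path(out, "old.txt").write_text("old")
        Path(new, "new.txt").write_text("new")
        release_with_backup(new, out)
        print(sorted(p.name for p in out.iterdir()))
```

The key design choice is ordering: the old output is deleted only after the new output has been moved into place, so a failure at any earlier point leaves the previous files recoverable.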
Full Changelog: v0.18.2...v0.19.0