salmon 1.2.1 release notes

@rob-p rob-p released this 22 Apr 14:34

This is a minor release, but it nonetheless adds a few important features and fixes an outstanding bug.

This release incorporates all of the improvements and additions of 1.2.0, which are significant and which are covered in detail here.

New features:

  • salmon learned a new command line option, --mismatchSeedSkip, which tunes seeding sensitivity for selective alignment. The default value is 5, which should work well in most cases. After a k-mer hit is extended to a uni-MEM, the extension can terminate for one of three reasons: the end of the read, the end of the unitig, or a mismatch. If the extension ends because of a mismatch, this is likely the result of a sequencing error. To avoid looking up many k-mers that will likely fail to be located in the index, the search procedure skips ahead in steps of mismatchSeedSkip until it either (1) finds another match or (2) is k bases past the mismatch position. This value controls that skip length: a smaller value can increase sensitivity, while a larger value can speed up seeding.

  • salmon learned about the environment variable SALMON_NO_VERSION_CHECK. If this environment variable is set (to either 1 or TRUE), salmon will skip checking for an updated version, regardless of whether it is passed the --no-version-check flag on the command line. This makes it easy, for example, to control this behavior for instances running on a cluster. This addresses issue 486, and we thank @cihanerkut for the suggestion.
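
The trade-off behind --mismatchSeedSkip can be illustrated with a small toy model. This is only a sketch, not salmon's implementation: the function names, the use of a Python set as the "index", and the convention that positions are k-mer start offsets measured from where extension stopped are all assumptions made for illustration.

```python
def probe_offsets(k, skip):
    """Offsets, relative to the point where extension stopped, that are
    probed before the cursor is k bases past that point (toy model)."""
    if skip < 1:
        raise ValueError("skip must be a positive integer")
    return list(range(skip, k, skip))


def resume_seeding(read, stop_pos, k, index_kmers, skip=5):
    """Return (position, lookups).

    position is the first probed k-mer start that hits the index, or
    stop_pos + k if none does (where normal seeding would resume).
    lookups counts how many index queries were made along the way.
    index_kmers is a hypothetical stand-in for the real index.
    """
    lookups = 0
    for offset in probe_offsets(k, skip):
        pos = stop_pos + offset
        if pos + k > len(read):
            break  # no full k-mer left in the read
        lookups += 1
        if read[pos:pos + k] in index_kmers:
            return pos, lookups  # (1) found another match
    return stop_pos + k, lookups  # (2) k bases past the stop position


if __name__ == "__main__":
    # Fewer probes with a larger skip; more probes (more chances to
    # find a match) with a smaller one.
    for skip in (1, 5, 10):
        print(skip, len(probe_offsets(31, skip)))
```

In this model, with k = 31, a skip of 1 makes up to 30 lookups after a mismatch, the default of 5 makes up to 6, and a skip of 10 makes up to 3, which is the sensitivity-versus-speed trade-off described above.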
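
The SALMON_NO_VERSION_CHECK rule can be sketched as follows. The helper names are hypothetical, and the sketch accepts only the two values named above (1 and TRUE); whether salmon accepts other spellings is not stated here, so none are assumed.

```python
import os


def skip_version_check(cli_flag=False, environ=None):
    """Return True if the update check should be skipped (toy model).

    cli_flag stands for --no-version-check; environ defaults to the
    real process environment.
    """
    env = os.environ if environ is None else environ
    return cli_flag or env.get("SALMON_NO_VERSION_CHECK") in ("1", "TRUE")


def env_without_version_check(base=None):
    """Copy an environment mapping and set SALMON_NO_VERSION_CHECK=1."""
    env = dict(os.environ if base is None else base)
    env["SALMON_NO_VERSION_CHECK"] = "1"
    return env
```

A mapping built by env_without_version_check() could be passed as the env argument of subprocess.run when launching salmon from a cluster job script, so that individual command lines do not need the flag.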

Improvements:

  • This is a change in default behavior. As raised in issue 505, salmon previously refused to index input containing duplicate decoy entries unless the --keepDuplicates flag was passed; users had to remove the duplicate decoys first. Since indexing duplicate decoy sequences does not make sense, duplicate decoys are now always discarded, regardless of the status of the --keepDuplicates flag. This lifts the burden on the user of having to ensure that the decoy sequences are free of duplicates. The behavior can now be described as: "If a decoy sequence is a duplicate of any previously-observed sequence, it is discarded, regardless of the status of the --keepDuplicates flag." This applies whether the decoy duplicates a previously-observed decoy or a non-decoy target sequence. The number of observed duplicate decoys (if > 0) will be reported to the log. Thanks to @tamuanand for raising the issue that led to this improvement.

  • During the build process, salmon (and pufferfish) now check directly whether std::numeric_limits<__int128> is defined, and set the preprocessor flags accordingly. This should address an issue reported when building under clang on OSX 10.15 (seemingly, earlier versions of the compiler turned on vendor-specific extensions under the -std=c++14 flag, while the newer version does not).
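
The duplicate-decoy rule described above can be sketched as a small filter. This is an illustrative model rather than salmon's indexing code: the function name and the (name, sequence) record layout are hypothetical, and exact string equality is assumed as the definition of "duplicate". For non-decoy sequences, the sketch assumes duplicates are kept only when --keepDuplicates is set.

```python
def filter_records(records, decoy_names, keep_duplicates=False):
    """Apply the duplicate rule to (name, sequence) records in input order.

    Returns (kept_records, number_of_duplicate_decoys_discarded).
    """
    seen = set()
    kept = []
    duplicate_decoys = 0
    for name, seq in records:
        is_duplicate = seq in seen
        seen.add(seq)
        if is_duplicate and name in decoy_names:
            # Duplicate decoys are always dropped, whatever the flag says.
            duplicate_decoys += 1
            continue
        if is_duplicate and not keep_duplicates:
            continue  # duplicate non-decoy target, flag not set
        kept.append((name, seq))
    return kept, duplicate_decoys


if __name__ == "__main__":
    records = [("t1", "ACGT"), ("t2", "ACGT"), ("d1", "ACGT"),
               ("d2", "GGCC"), ("d3", "GGCC")]
    kept, n = filter_records(records, {"d1", "d2", "d3"}, keep_duplicates=True)
    print([name for name, _ in kept], n)
```

Here d1 duplicates a non-decoy target and d3 duplicates an earlier decoy; both are discarded even with keep_duplicates=True, and the count of 2 corresponds to the number that would be reported to the log.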

Bug fixes:

  • Fixed a possibly uninitialized variable (sopt.noSA) in argument parsing.