Sailfish v0.8.0

@rob-p rob-p released this 16 Nov 04:35

This release brings minor bug fixes and two significant new features.

Bug fixes

  • Fixed a bug where the computed mapping rate (output in a comment at the top of quant.sf) could slightly over-estimate the true mapping rate (i.e. the sum of the estimated counts divided by the number of observed fragments).
  • Fixed a bug that prevented some messages from being written to the log prior to exit (when errors were encountered in processing).

New Features

  • Support for flexible handling of stranded libraries. This includes two new options, --enforceLibCompat and --ignoreLibCompat. By default, when neither of these flags is specified, the behavior is as follows. When a fragment is mapped, all of its multi-mappings are checked for compatibility with the specified library format type. If any mappings are compatible, then all incompatible mappings are discarded. However, if no compatible mappings are found, then the incompatible mappings are counted.
    • When the --enforceLibCompat flag is passed, only compatible mappings are ever considered. Thus, if a fragment has no compatible mappings but does have incompatible ones, it is treated as if it has no mappings.
    • When the --ignoreLibCompat flag is passed, all mappings are considered compatible. This effectively disables testing the compatibility of mappings with the specified format.
  • Large quasi-index support has been added. Now, when building the index, Sailfish will determine if a 32-bit suffix array is sufficient or if a 64-bit suffix array is required. It will build and use the appropriate suffix array (and report the result to the log). Note: The indexing code is generic, but the 64-bit index has been tested much less than the 32-bit index.
  • Improved the manner in which gene lengths are calculated when aggregating transcript-level results to the gene level. If at least one transcript of a gene is expressed, the gene length is computed as the expression-weighted sum of the lengths of the expressed transcripts. If no transcript of a gene is expressed (i.e. the TPMs of all of its transcripts are 0), then the length is reported as the average transcript length. This improves upon the prior rule of simply reporting the gene length as the length of the gene's longest transcript.
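
The three library-compatibility behaviors described above (default, --enforceLibCompat, --ignoreLibCompat) can be summarized in a short Python sketch. This is only an illustration of the rules as stated in these notes, not Sailfish's actual code: the `Mapping` type, the `filter_mappings` function, and the mode strings are hypothetical names introduced here.

```python
from typing import List, NamedTuple


class Mapping(NamedTuple):
    """Hypothetical record for one mapping of a fragment."""
    transcript: str
    compatible: bool  # True if the mapping agrees with the declared library format


def filter_mappings(mappings: List[Mapping], mode: str = "default") -> List[Mapping]:
    """Return the mappings that would be kept for a single fragment.

    mode is one of (names are assumptions made for this sketch):
      "default" - neither flag given
      "enforce" - corresponds to --enforceLibCompat
      "ignore"  - corresponds to --ignoreLibCompat
    """
    if mode == "ignore":
        # Compatibility testing is effectively disabled: keep everything.
        return list(mappings)

    compatible = [m for m in mappings if m.compatible]

    if mode == "enforce":
        # Only compatible mappings count; the result may be empty, in which
        # case the fragment is treated as unmapped.
        return compatible

    if mode == "default":
        # Prefer compatible mappings; fall back to the incompatible ones
        # only when no compatible mapping exists.
        return compatible if compatible else list(mappings)

    raise ValueError(f"unknown mode: {mode!r}")
```

For a fragment whose mappings are all incompatible, "default" and "ignore" keep every mapping, while "enforce" returns an empty list, so the fragment contributes nothing to quantification.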
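
The gene-length rule above can be sketched in a few lines of Python. The notes do not spell out the exact formula, so this sketch makes three assumptions: the weights are each transcript's TPM normalized by the gene's total TPM (making the result a weighted average of lengths), "expressed" means TPM > 0, and the fallback is the unweighted mean over the gene's own transcripts. The function name `gene_length` is hypothetical.

```python
from typing import Sequence


def gene_length(lengths: Sequence[float], tpms: Sequence[float]) -> float:
    """Aggregate transcript lengths into a single gene length.

    lengths and tpms are parallel sequences, one entry per transcript
    of the gene.
    """
    if len(lengths) != len(tpms) or len(lengths) == 0:
        raise ValueError("lengths and tpms must be non-empty and of equal size")

    total_tpm = sum(tpms)
    if total_tpm > 0:
        # Assumption: weights are TPM / total TPM, so transcripts with
        # TPM == 0 drop out and only expressed transcripts contribute.
        return sum(l * t for l, t in zip(lengths, tpms)) / total_tpm

    # No transcript is expressed: fall back to the plain average length.
    return sum(lengths) / len(lengths)


print(gene_length([1000, 2000], [3.0, 1.0]))  # 1250.0
print(gene_length([1000, 2000], [0.0, 0.0]))  # 1500.0
```

Under the earlier longest-transcript rule, both calls would have reported 2000, regardless of which transcript actually carried the expression.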