Releases: rouskinlab/seismic-rna

Cluster Clones

15 Nov 13:42
Pre-release

What's new in v0.9.4

Bugfixes

  • Prior to this release, on some (but not all) platforms, running multiple EM clustering runs in parallel (but not in series) caused them to have identical trajectories. I suspect, but have not demonstrated, that the bug occurred because on those platforms the entire Python state, including that of the random number generator, was copied to each subprocess by concurrent.futures.ProcessPoolExecutor. Consistent with this mechanism, subprocesses on the same platforms also log messages the same way as the main process, which suggests that the root logger is copied too. So that every clustering run follows a unique trajectory even when the subprocesses start with the same Python state, each run now accepts a seed for its random number generator; the main process draws these seeds at random to ensure that every run is seeded uniquely.
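
The seeding scheme described above can be sketched as follows. This is a minimal illustration and not the code in SEISMIC-RNA: the names make_unique_seeds, run_clustering, and run_all are hypothetical, and a list of random numbers stands in for an EM trajectory.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def make_unique_seeds(n_runs: int) -> list[int]:
    """Draw n_runs distinct 32-bit seeds in the main process."""
    # Sampling without replacement guarantees that no two seeds are equal.
    return random.SystemRandom().sample(range(2**32), n_runs)


def run_clustering(seed: int, n_steps: int = 5) -> list[float]:
    """Hypothetical stand-in for one EM run: return a random trajectory."""
    # A private generator seeded explicitly does not depend on whatever
    # global random state the subprocess inherited from its parent.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_steps)]


def run_all(n_runs: int) -> list[list[float]]:
    """Run every clustering in its own subprocess with its own seed."""
    seeds = make_unique_seeds(n_runs)
    with ProcessPoolExecutor() as pool:
        return list(pool.map(run_clustering, seeds))
```

In a script, run_all should be called from under an `if __name__ == "__main__":` guard so that subprocesses can be started safely on every platform. Because each run builds its own generator from its seed, two runs produce the same trajectory only if they receive the same seed.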

Internals

  • Adopt the convention where all strings use double quotation marks.
  • Add more detailed __str__ methods to some custom classes.

Documentation

  • Add more information to the documentation, especially to Manuals -> Workflow.

Full Changelog: v0.9.3...v0.9.4

WebApp Export

10 Nov 00:39
Pre-release

What's new in 0.9.3

New Features

  • Export mutational and structural data and metadata for the web app using seismic export web [-S sample-metadata.csv] [-R reference-metadata.csv] [samples ...]

Bugfixes

  • BAM/CRAM files with insufficient reads are no longer returned in the list of output files from the align step.

Full Changelog: v0.9.2...v0.9.3

Memory Managed

09 Nov 00:44
Pre-release

What's new in v0.9.2

Bugfixes

  • Serious memory leaks caused by cached instance methods of the MutsBatch class (since v0.9.0) have been fixed (credit to @justinaruda).
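
One common way for cached instance methods to leak memory is sketched below. Whether MutsBatch leaked in exactly this way is an assumption; LeakyBatch, CleanBatch, and survives_deletion are hypothetical names used only for illustration. A cache created by functools.lru_cache on a method lives on the class and keeps a strong reference to every instance it has seen, whereas functools.cached_property stores the result on the instance itself.

```python
import functools
import gc
import weakref


class LeakyBatch:
    def __init__(self, data):
        self.data = data

    # The cache belongs to the function, which lives on the class, and
    # uses self as part of the key, so it keeps every instance alive.
    @functools.lru_cache(maxsize=None)
    def total(self):
        return sum(self.data)


class CleanBatch:
    def __init__(self, data):
        self.data = data

    # The cached value is stored in the instance's own __dict__, so it
    # is released together with the instance.
    @functools.cached_property
    def total(self):
        return sum(self.data)


def survives_deletion(batch_type, use) -> bool:
    """Return True if an instance is still alive after being deleted."""
    batch = batch_type(list(range(1000)))
    use(batch)
    ref = weakref.ref(batch)
    del batch
    gc.collect()
    return ref() is not None
```

Here survives_deletion(LeakyBatch, lambda b: b.total()) returns True because the method's cache still refers to the instance, while survives_deletion(CleanBatch, lambda b: b.total) returns False.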

Internals

  • A new Header class has been introduced to handle parsing and formatting table headers for types of relationships and/or clusters, along with a test suite for it.
  • Both branches for demultiplexing (credit to @heWhosShouldersBlockTheSun) have been merged into the main branch.

Full Changelog: v0.9.1...v0.9.2

Underworld of the Unsigned

25 Oct 00:13
Pre-release

What's new in 0.9.1

Bug fixes

  • Fixed a bug in seismicrna.core.batch.index: when target is a NumPy ndarray with an unsigned integer data type, calling np.full(target.max(initial=-1) + 1, -1) would implicitly convert -1 to the maximum value of that data type (e.g. 4,294,967,295 for a 32-bit unsigned integer) and attempt to allocate memory for an enormous array of this size. On some systems this caused a crash; on others it merely wasted time allocating the memory.
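
The wraparound behind this bug, and one way to avoid it, can be sketched as follows. The function inverse_index is hypothetical and is not the code in seismicrna.core.batch.index; it only assumes that the array being allocated maps each value in target back to its position. The explicit cast at the top shows the wraparound directly, because how NumPy treats initial=-1 on an unsigned array may differ between NumPy versions.

```python
import numpy as np

# Casting -1 to an unsigned type wraps around to that type's maximum.
wrapped = np.array([-1]).astype(np.uint32)[0]


def inverse_index(target: np.ndarray) -> np.ndarray:
    """Hypothetical helper: map each value in target to its position.

    Values that are absent from target map to -1.
    """
    # Convert the maximum to a Python int before adding 1 so that the
    # arithmetic cannot wrap around in an unsigned NumPy type.
    length = int(target.max()) + 1 if target.size > 0 else 0
    # A signed dtype is requested explicitly so that -1 is stored as -1.
    inverse = np.full(length, -1, dtype=np.int64)
    inverse[target] = np.arange(target.size)
    return inverse
```

For an empty unsigned array, this sketch allocates an array of length 0 rather than one sized by a wrapped-around value.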

Speed by Sparsity

24 Oct 20:46
Pre-release

What's new in 0.9.0

Performance upgrades

  • Mutation data is now processed and saved in a sparse format that tracks only the mutated positions. Since mutations make up only 1-5% of most datasets, the sparse format is more storage-, memory-, and time-efficient.
  • Batches are now saved in Brotli-compressed pickle ("Brickle") files, which require less storage and allow more types of data to be saved than the previous Parquet and gzip-compressed CSV formats.
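
The idea behind the sparse format in the first item can be illustrated with a toy example. The actual layout that SEISMIC-RNA uses is not described in these notes, so the representation below (one list of mutated positions per read) and the names to_sparse and mutations_per_position are assumptions made for illustration.

```python
def to_sparse(dense: list[list[int]]) -> list[list[int]]:
    """For each read (row), keep only the positions that are mutated."""
    return [[pos for pos, is_mut in enumerate(row) if is_mut]
            for row in dense]


def mutations_per_position(sparse: list[list[int]],
                           n_positions: int) -> list[int]:
    """Count the mutations at each position from the sparse format."""
    counts = [0] * n_positions
    # Only mutated positions are visited, so the work scales with the
    # number of mutations rather than with reads times positions.
    for positions in sparse:
        for pos in positions:
            counts[pos] += 1
    return counts


# Three reads over four positions; 1 marks a mutation.
dense = [[0, 1, 0, 0],
         [0, 0, 0, 0],
         [1, 1, 0, 0]]
sparse = to_sparse(dense)
counts = mutations_per_position(sparse, n_positions=4)
```

With mutations at only a few percent of positions, the sparse lists hold a small fraction of the entries in the dense matrix, which is where the savings in storage, memory, and time would come from.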

New features

  • The table step has been sped up via the sparse data format, and computing all fields is nearly as fast as computing one field. Thus table now computes all fields automatically (the option to compute select fields has been removed).

Bug fixes

  • When running align on demultiplexed FASTQ files, one report file is now generated for each FASTQ file, instead of every FASTQ file of a sample writing to and overwriting the same report file.
  • When running relate on multiple samples that are aligned to the same set of references, every BAM/CRAM file from every sample is now processed, instead of only one sample's BAM/CRAM file for each reference.
  • When running fold, misformatting of the RNAstructure Fold command has been fixed.

Internals

  • The core modules have been refactored into a group of subpackages, each with their own modules.
  • The all subcommand has been moved from the main.py module to its own subpackage.
  • The mutation calling and counting routines in the modules seismicrna.core.bitcall and seismicrna.core.bitvect, respectively, have been rewritten and replaced with seismicrna.core.rel.pattern and seismicrna.core.batch.accum.
  • The unique read finding algorithm has likewise been rewritten and moved to seismicrna.cluster.uniq.

Full Changelog: v0.8.0...v0.9.0

Pipe-A-line

06 Sep 16:50
Pre-release

What's new in 0.8.0?

  • Align now generates CRAM files with minimal headers instead of BAM files with full headers, so that runs with large FASTA or FASTQ files require less storage space.
  • Align has been re-implemented as two shell pipelines instead of as a series of separate commands glued together with Python, to make it run faster and require less storage of temporary files.
  • A new function for parsing only the names of references in FASTA files (if the sequences are not needed) is based on grep and runs several times faster on large files than does the Python-based function for parsing both names and sequences.
  • The ambiguous nucleotide "N" is now supported in both reference and read sequences (previously, it was supported in neither).
  • Unit tests have been updated to handle N in DNA and RNA sequences.
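
A grep-based parser for reference names, like the one mentioned above, might look like the sketch below. The function fasta_ref_names is hypothetical and is not the function in SEISMIC-RNA. It assumes that grep is available on the PATH and that the name of a reference is the first whitespace-delimited word after ">" on its header line.

```python
import subprocess
from pathlib import Path


def fasta_ref_names(fasta: Path) -> list[str]:
    """Return the names of the references in a FASTA file using grep."""
    # grep prints only the header lines, so the sequences are never
    # loaded into Python.
    result = subprocess.run(["grep", "^>", str(fasta)],
                            capture_output=True, text=True)
    # grep exits with 0 on a match, 1 on no match, and >1 on an error.
    if result.returncode > 1:
        raise RuntimeError(result.stderr)
    names = []
    for line in result.stdout.splitlines():
        words = line[1:].split()
        if words:
            # Assumption: the name is the first word after ">".
            names.append(words[0])
    return names
```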

Full Changelog: v0.7.1...v0.8.0

Hatch Targets

01 Sep 15:17
Pre-release

What's new in v0.7.1

  • Added Hatch targets to pyproject.toml
  • Updated documentation

Full Changelog: v0.7.0...v0.7.1

Dr. Docs

31 Aug 18:39
Pre-release

What's new in 0.7.0

  • Output directories are now organized as out/sample/step/ref instead of out/step/sample/ref.
  • Documentation has been partially updated.
  • More unit tests have been added.
  • A release schedule has been added.
  • The --min-mapq option has been added.
  • Log messages are color coded.

Full Changelog: v0.6.2...v0.7.0

Bugfix for min_nmut_read

16 Aug 14:56
Pre-release

What's new in v0.6.2

  • Fixed bug with min_nmut_read not being accepted by main.run()

Full Changelog: v0.6.0...v0.6.2

Sliding correlations

14 Aug 23:28
Pre-release

What's new in v0.6.0

  • Graph Pearson or Spearman correlations between two samples in sliding windows.
  • Fix bug involving missing arguments in struct module.
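
The sliding-window correlation in the first item above can be sketched as follows. This illustrates the general technique, not the graphing code in SEISMIC-RNA: the function names and the window and step parameters are assumptions, and only the Pearson correlation is shown.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The correlation is undefined if either sequence is constant.
    if var_x == 0 or var_y == 0:
        return math.nan
    return cov / math.sqrt(var_x * var_y)


def sliding_pearson(x: list[float], y: list[float],
                    window: int, step: int = 1) -> list[float]:
    """Correlate two samples within each window along the sequence."""
    if len(x) != len(y):
        raise ValueError("both samples must cover the same positions")
    if window < 2 or step < 1:
        raise ValueError("window must be >= 2 and step must be >= 1")
    return [pearson(x[start: start + window], y[start: start + window])
            for start in range(0, len(x) - window + 1, step)]
```

For example, with per-position mutation rates x and y from two samples, sliding_pearson(x, y, window=3) returns one coefficient for each window of three consecutive positions.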