15 Nov 13:42

matthewfallan

7866deb

Cluster Clones Pre-release

What's new in v0.9.4

Bugfixes

Prior to this release, on some (but not all) platforms, running multiple EM clustering runs in parallel (but not in series) caused them to follow identical trajectories. I suspect (but have not demonstrated) that this bug happened because, on such platforms, concurrent.futures.ProcessPoolExecutor copied the entire Python state, including the state of the random number generator, to each subprocess. Consistent with this mechanism, subprocesses on the same platforms also log messages the same way as the main process, which suggests that the root logger is copied as well. So that each clustering run has a unique trajectory even when it starts from the same Python state, each run now accepts a seed for its random number generator; the main process randomizes these seeds so that each clustering run is seeded uniquely.
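The seeding scheme described above can be sketched as follows. This is a minimal illustration using only the standard library, not the package's actual code: run_clustering and make_seeds are hypothetical names, and the "trajectory" is just a few random numbers standing in for an EM run.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def run_clustering(seed: int) -> list[float]:
    """Hypothetical stand-in for one EM clustering run."""
    # Each run builds its own generator from the seed it was given, so it
    # does not depend on any random state inherited from the main process.
    rng = random.Random(seed)
    return [rng.random() for _ in range(3)]


def make_seeds(n_runs: int) -> list[int]:
    """Draw one distinct seed per run in the main process."""
    # random.sample draws without replacement, so no two seeds are equal.
    return random.sample(range(2**32), n_runs)


if __name__ == "__main__":
    seeds = make_seeds(4)
    with ProcessPoolExecutor() as pool:
        trajectories = list(pool.map(run_clustering, seeds))
    for seed, trajectory in zip(seeds, trajectories):
        print(seed, trajectory)
```

Because every run derives its random numbers only from its own seed, the result no longer depends on whether the platform copies the parent's state into each subprocess.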

Internals

Adopt the convention where all strings use double quotation marks.
Add more detailed __str__ methods to some custom classes.

Documentation

Add more information to the documentation, especially to Manuals -> Workflow.

Full Changelog: v0.9.3...v0.9.4

10 Nov 00:39

matthewfallan

v0.9.3

a8f7d6f

WebApp Export Pre-release

What's new in 0.9.3

New Features

Export mutational and structural data and metadata for the web app using seismic export web [-S sample-metadata.csv] [-R reference-metadata.csv] [samples ...]

Bugfixes

BAM/CRAM files with insufficient reads are no longer returned in the list of output files from the align step.

Full Changelog: v0.9.2...v0.9.3

09 Nov 00:44

matthewfallan

v0.9.2

f92b1e1

Memory Managed Pre-release

What's new in v0.9.2

Bugfixes

Serious memory leaks caused by cached instance methods of the MutsBatch class (since v0.9.0) have been fixed (credit to @justinaruda).
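As a general illustration of how a cached instance method can leak memory (the exact mechanism in MutsBatch is not described here, so this is an assumption), the sketch below compares functools.cache on a method with functools.cached_property. LeakyBatch, TidyBatch, and freed_after_use are hypothetical names.

```python
import gc
import weakref
from functools import cache, cached_property


class LeakyBatch:
    """Hypothetical class whose cached method keeps instances alive."""

    def __init__(self, data):
        self.data = data

    @cache  # the cache belongs to the class and holds a reference to self
    def total(self):
        return sum(self.data)


class TidyBatch:
    """Same computation, but the cached value is stored on the instance."""

    def __init__(self, data):
        self.data = data

    @cached_property  # the value lives in the instance's own __dict__
    def total(self):
        return sum(self.data)


def freed_after_use(make_batch, use) -> bool:
    """Return whether an instance is freed once its last name is deleted."""
    batch = make_batch()
    use(batch)  # trigger the caching
    ref = weakref.ref(batch)
    del batch
    gc.collect()
    return ref() is None


leaky_freed = freed_after_use(lambda: LeakyBatch([1, 2, 3]), lambda b: b.total())
tidy_freed = freed_after_use(lambda: TidyBatch([1, 2, 3]), lambda b: b.total)
print(leaky_freed, tidy_freed)
```

The method cache uses self as part of its key, so every instance that has ever called the method stays reachable until the cache is cleared; storing the cached value on the instance avoids that.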

Internals

A new Header class has been introduced to handle parsing and formatting table headers for types of relationships and/or clusters, along with a test suite for it.
Both branches for demultiplexing (credit to @heWhosShouldersBlockTheSun) have been merged into the main branch.

What's Changed

Patched several memory leaks in the mask module. by @justinaruda in #2
Demulti reset by @matthewfallan in #3
Demult fixed by @matthewfallan in #4

New Contributors

@justinaruda made their first contribution in #2
@matthewfallan made their first contribution in #3

Full Changelog: v0.9.1...v0.9.2

Contributors

matthewfallan, justinaruda, and heWhosShouldersBlockTheSun

25 Oct 00:13

matthewfallan

v0.9.1

794bbde

Underworld of the Unsigned Pre-release

What's new in 0.9.1

Bug fixes

Fixed a bug in seismicrna.core.batch.index: calling np.full(target.max(initial=-1) + 1, -1), where target is a NumPy ndarray of an unsigned integer type, would implicitly convert -1 to an unsigned integer with the maximum value of its data type (e.g. 4,294,967,295 for a 32-bit integer) and attempt to allocate memory for an enormous array of this size. On some systems, this would cause a crash; on others, it would simply waste time allocating the memory.
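The wraparound behind this bug, and one way to avoid it, can be sketched as below. This is not necessarily the fix that was applied in seismicrna.core.batch.index; index_length is a hypothetical helper, and the example array is made up.

```python
import numpy as np

target = np.array([2, 0, 5], dtype=np.uint32)

# Casting -1 to an unsigned type wraps around to the maximum of that type.
wrapped = np.array(-1, dtype=np.int64).astype(np.uint32)
print(int(wrapped))


def index_length(target: np.ndarray) -> int:
    """Length of an index array covering 0 through max(target)."""
    # Handle the empty case explicitly rather than via initial=-1, and
    # convert the maximum to a Python int before doing any arithmetic.
    if target.size == 0:
        return 0
    return int(target.max()) + 1


# The fill value -1 needs a signed dtype, stated explicitly here.
index = np.full(index_length(target), -1, dtype=np.int64)
print(index.shape)
```

Keeping the length computation in Python integers means the sentinel value -1 never meets the unsigned data type of the array.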

24 Oct 20:46

matthewfallan

v0.9.0

37eeb0c

Speed by Sparsity Pre-release

What's new in 0.9.0

Performance upgrades

Mutation data is now processed and saved in a sparse format that tracks only the mutated positions. Since mutations make up only 1–5% of most datasets, the sparse format is more storage-, memory-, and time-efficient.
Batches are now saved in Brotli-compressed pickle ("Brickle") files, which require less storage and allow more types of data to be saved than the previous Parquet and gzip-compressed CSV formats.
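To illustrate why tracking only the mutated positions saves space, here is a minimal sketch of one possible sparse layout. It is an assumption for illustration, not the format the package actually uses: to_sparse and to_dense are hypothetical helpers, and the data are simulated with a 3% mutation rate.

```python
import numpy as np


def to_sparse(dense: np.ndarray) -> dict[int, np.ndarray]:
    """Map each mutated position to the indices of the reads mutated there."""
    sparse = {}
    for pos in range(dense.shape[1]):
        reads = np.flatnonzero(dense[:, pos])
        if reads.size > 0:
            sparse[pos] = reads
    return sparse


def to_dense(sparse: dict[int, np.ndarray], n_reads: int, n_pos: int) -> np.ndarray:
    """Rebuild the reads-by-positions boolean matrix from the sparse form."""
    dense = np.zeros((n_reads, n_pos), dtype=bool)
    for pos, reads in sparse.items():
        dense[reads, pos] = True
    return dense


# Simulate 1000 reads over 200 positions with about 3% mutations.
rng = np.random.default_rng(0)
dense = rng.random((1000, 200)) < 0.03
sparse = to_sparse(dense)
n_stored = sum(reads.size for reads in sparse.values())
print(n_stored, dense.size)
```

Only the mutated entries are stored, so the number of stored values scales with the mutation rate rather than with the product of reads and positions.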

New features

The table step has been sped up via the sparse data format, and computing all fields is nearly as fast as computing one field. Thus table now computes all fields automatically (the option to compute select fields has been removed).

Bug fixes

When running align on demultiplexed FASTQ files, one report file is now generated for each FASTQ file, rather than all FASTQ files for each sample writing to and overwriting one report file.
When running relate on multiple samples that are aligned to the same set of references, every BAM/CRAM file from every sample is processed instead of only one sample BAM/CRAM file for each reference.
When running fold, misformatting of the RNAstructure Fold command has been fixed.

Internals

The core modules have been refactored into a group of subpackages, each with its own modules.
The all subcommand has been moved from the main.py module to its own subpackage.
The mutation calling and counting routines in the modules seismicrna.core.bitcall and seismicrna.core.bitvect, respectively, have been rewritten and replaced with seismicrna.core.rel.pattern and seismicrna.core.batch.accum.
The unique read finding algorithm has likewise been rewritten and moved to seismicrna.cluster.uniq.

Full Changelog: v0.8.0...v0.9.0

06 Sep 16:50

matthewfallan

v0.8.0

9bdece5

Pipe-A-line Pre-release

What's new in 0.8.0?

Align now generates CRAM files with minimal headers instead of BAM files with full headers so that large FASTA files and large FASTQ files eventually require less storage space.
Align has been re-implemented as two shell pipelines instead of as a series of separate commands glued together with Python, to make it run faster and require less storage of temporary files.
A new function for parsing only the names of references in FASTA files (if the sequences are not needed) is based on grep and runs several times faster on large files than does the Python-based function for parsing both names and sequences.
The ambiguous nucleotide "N" is now supported in both reference and read sequences (previously, neither).
Unit tests have been updated to handle N in DNA and RNA sequences.
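The grep-based name parsing mentioned above might look roughly like the following sketch. The function name fasta_names_grep is hypothetical, the sketch assumes grep is available on the PATH, and it treats everything after ">" on a header line as the name, which may differ from what the package does.

```python
import subprocess
import tempfile
from pathlib import Path


def fasta_names_grep(fasta: Path) -> list[str]:
    """List reference names by letting grep select only the header lines."""
    result = subprocess.run(
        ["grep", "^>", str(fasta)],
        capture_output=True,
        text=True,
    )
    # grep exits with status 1 when no line matches, which is not an error.
    if result.returncode > 1:
        raise RuntimeError(result.stderr)
    # Drop the leading ">" and surrounding whitespace from each header.
    return [line[1:].strip() for line in result.stdout.splitlines()]


with tempfile.TemporaryDirectory() as tmp:
    fasta = Path(tmp) / "refs.fa"
    fasta.write_text(">ref1\nACGT\nNNAC\n>ref2\nGGCA\n")
    names = fasta_names_grep(fasta)
print(names)
```

Because grep never hands the sequence lines to Python, the cost of this approach is dominated by a single pass over the file in compiled code.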

Full Changelog: v0.7.1...v0.8.0

01 Sep 15:17

matthewfallan

v0.7.1

ddaff9c

Hatch Targets Pre-release

What's new in v0.7.1

Added Hatch targets to pyproject.toml
Updated documentation

Full Changelog: v0.7.0...v0.7.1

31 Aug 18:39

matthewfallan

v0.7.0

b1ca4ef

Dr. Docs Pre-release

What's new in 0.7.0

Output directories are now organized as out/sample/step/ref instead of out/step/sample/ref.
Documentation has been partially updated.
More unit tests have been added.
A release schedule has been added.
The --min-mapq option has been added.
Log messages are color coded.

Full Changelog: v0.6.2...v0.7.0

16 Aug 14:56

matthewfallan

v0.6.2

0e16440

Bugfix for min_nmut_read Pre-release

What's new in v0.6.2

Fixed bug with min_nmut_read not being accepted by main.run()

Full Changelog: v0.6.0...v0.6.2

14 Aug 23:28

matthewfallan

v0.6.0

2e485d0

Sliding correlations Pre-release

What's new in v0.6.0

Graph Pearson or Spearman correlations between two samples in sliding windows.
Fix bug involving missing arguments in struct module.
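A sliding-window correlation between two samples, as in the first item above, can be sketched as follows. This is an illustration only: sliding_pearson is a hypothetical function, the window and step parameters are assumptions, and only the Pearson case is shown (a Spearman version would rank-transform the values in each window first).

```python
import numpy as np


def sliding_pearson(x, y, window: int, step: int = 1) -> np.ndarray:
    """Pearson correlation of x and y within each sliding window."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim != 1 or x.shape != y.shape:
        raise ValueError("x and y must be 1-D arrays of equal length")
    if not 2 <= window <= x.size:
        raise ValueError("window must be between 2 and the number of positions")
    starts = range(0, x.size - window + 1, step)
    # np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r(x, y).
    return np.array([
        np.corrcoef(x[s:s + window], y[s:s + window])[0, 1]
        for s in starts
    ])


# Two made-up per-position profiles that are perfectly linearly related.
sample1 = np.arange(10, dtype=float)
sample2 = 2.0 * sample1 + 1.0
correlations = sliding_pearson(sample1, sample2, window=4)
print(correlations)
```

Each output value summarizes agreement between the two samples over one window, so plotting the values against the window start positions shows where along the sequence the samples agree or diverge.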

Releases: rouskinlab/seismic-rna
