Releases: pmelsted/pizzly

Bug fixes and license switch

20 Jul 21:40

We recommend that users switch to this release because of a few bug fixes.

  • Fixed the wrong strand being reported when a fusion is supported only by read pairs and no split reads
  • Fixed a bug with negative coordinates
  • Fixed a bug when reference sequences contained lowercase characters
  • Corrected the order of geneA and geneB in the JSON output

Additionally, we now have a scripts folder with useful Python scripts:

  • get_fragment_length.py examines an abundance.h5 produced by kallisto and finds the 95th percentile of the fragment length distribution
  • flatten_json.py reads the .json output and converts to a simple gene table
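
As a rough illustration of what "95th percentile of the fragment length distribution" means, here is a minimal sketch. It is not the code in get_fragment_length.py: it assumes the distribution is already available as a plain list of counts indexed by fragment length, and the function name is hypothetical.

```python
def fragment_length_percentile(counts, percentile=95.0):
    """Return the smallest fragment length whose cumulative count
    reaches the requested percentile of all fragments.

    counts[i] is assumed to be the number of fragments of length i.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("empty fragment length distribution")
    threshold = total * percentile / 100.0
    cumulative = 0
    for length, count in enumerate(counts):
        cumulative += count
        if cumulative >= threshold:
            return length
    # Only reached through floating-point rounding at percentile=100.
    return len(counts) - 1


# Toy histogram (made-up numbers): 100 fragments with lengths 1 to 4.
toy_counts = [0, 10, 60, 25, 5]
print(fragment_length_percentile(toy_counts))
```

With the toy histogram, 95 of the 100 fragments have length 3 or less, so the function returns 3.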

The license was switched from GPL to BSD-2.

Better GTF support

28 Jun 11:03

GTF

This version includes bug fixes that improve GTF parsing. pizzly now supports the Ensembl and Gencode annotations and has been tested with the latest versions of both.

Note that for Gencode the FASTA files must be modified so that they match the GTF files (Gencode uses pipes, |, rather than spaces as the separator in FASTA sequence names). This can be fixed by running:

zcat gencode.v26.transcripts.fa.gz | tr '|' ' ' | gzip -1 > gencode.v26.transcripts.fixed.fa.gz
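
Where zcat and tr are not available, the same conversion can be sketched in Python. This is only an illustration; the function names are hypothetical and not part of pizzly. Unlike the tr command, which replaces every pipe in the file, this version changes header lines only, which gives the same result for nucleotide sequences because they contain no pipes.

```python
import gzip


def fix_header(line):
    """Replace pipes with spaces in FASTA header lines; leave sequence lines as-is."""
    if line.startswith(">"):
        return line.replace("|", " ")
    return line


def fix_gencode_fasta(src_path, dst_path):
    """Stream a gzipped FASTA file and write a gzipped copy with fixed headers."""
    # compresslevel=1 mirrors the `gzip -1` in the shell command.
    with gzip.open(src_path, "rt") as src, \
            gzip.open(dst_path, "wt", compresslevel=1) as dst:
        for line in src:
            dst.write(fix_header(line))
```

It could then be called as fix_gencode_fasta("gencode.v26.transcripts.fa.gz", "gencode.v26.transcripts.fixed.fa.gz").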

Protein coding annotation

pizzly limits the fusion reports to transcripts that have been annotated as protein coding. If this information is not present in the annotation, the --ignore-protein option ignores this requirement. Running pizzly in this way will most likely increase the number of false positives reported.

Warnings

pizzly will now warn when there are sequences in the FASTA file with no corresponding annotation and exit if no sequences have available annotation. pizzly also warns if no transcripts are annotated as protein coding.
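
The behaviour described above can be pictured with a small sketch. It is not pizzly's implementation (pizzly is not written in Python); the function name, argument types, and message wording are all assumptions made for illustration.

```python
import sys


def check_annotation(fasta_names, annotated_ids):
    """Warn about FASTA sequences without annotation; exit if none are annotated.

    fasta_names: list of sequence names from the FASTA file (hypothetical input).
    annotated_ids: set of transcript IDs found in the GTF (hypothetical input).
    Returns the list of names that have no annotation.
    """
    missing = [name for name in fasta_names if name not in annotated_ids]
    for name in missing:
        print(f"Warning: no annotation found for sequence {name}", file=sys.stderr)
    if fasta_names and len(missing) == len(fasta_names):
        # Nothing can be reported without any annotated sequence, so stop here.
        sys.exit("Error: none of the FASTA sequences have annotation")
    return missing
```

A mismatch in sequence names, such as the Gencode pipe separator described earlier, is one way to end up with every sequence reported as missing.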

Better filtering

12 Apr 21:38

Filtering

Pizzly now outputs filtered and unfiltered fusion calls.

Pizzly filters on:

  • number of supporting reads
  • distance of the fusion breakpoint to exon boundaries
  • for fusions with unknown breakpoints, only read pairs that are consistent with the maximum fragment length are included

Example

An example pipeline is provided, based on data from Tembe et al., "Open-access synthetic spike-in mRNA-seq data for cancer gene fusions".

The example pipeline is implemented in Snakemake.

First release

21 Mar 20:54

First release of pizzly.