deploy: eb7d525
toniher committed Dec 2, 2024
1 parent a13c5d3 commit b3caa32
Showing 113 changed files with 8,518 additions and 0 deletions.
4 changes: 4 additions & 0 deletions refs/heads/MOP3/.buildinfo
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file records the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 4dc3b1b5f2582bb3fc7bd1eb47eda37b
tags: 645f666f9bcd5a90fca523b33c5a78b7
Binary file added refs/heads/MOP3/.doctrees/about.doctree
Binary file not shown.
Binary file added refs/heads/MOP3/.doctrees/benchmark.doctree
Binary file not shown.
Binary file added refs/heads/MOP3/.doctrees/changelog.doctree
Binary file not shown.
Binary file added refs/heads/MOP3/.doctrees/ci.doctree
Binary file not shown.
Binary file added refs/heads/MOP3/.doctrees/environment.pickle
Binary file not shown.
Binary file added refs/heads/MOP3/.doctrees/index.doctree
Binary file not shown.
Binary file added refs/heads/MOP3/.doctrees/install.doctree
Binary file not shown.
Binary file added refs/heads/MOP3/.doctrees/mop_consensus.doctree
Binary file not shown.
Binary file added refs/heads/MOP3/.doctrees/mop_mod.doctree
Binary file not shown.
Binary file added refs/heads/MOP3/.doctrees/mop_preprocess.doctree
Binary file not shown.
Binary file added refs/heads/MOP3/.doctrees/mop_tail.doctree
Binary file not shown.
Binary file added refs/heads/MOP3/.doctrees/reporting.doctree
Binary file not shown.
Binary file added refs/heads/MOP3/.doctrees/troubleshooting.doctree
Binary file not shown.
Empty file added refs/heads/MOP3/.nojekyll
Empty file.
Binary file added refs/heads/MOP3/_images/epinano.png
Binary file added refs/heads/MOP3/_images/flow_mod.png
Binary file added refs/heads/MOP3/_images/flow_preproc.png
Binary file added refs/heads/MOP3/_images/flow_tail.png
Binary file added refs/heads/MOP3/_images/goku3.png
Binary file added refs/heads/MOP3/_images/mod_corr.png
Binary file added refs/heads/MOP3/_images/nanocons.png
Binary file added refs/heads/MOP3/_images/res_report.png
Binary file added refs/heads/MOP3/_images/tower.gif
Binary file added refs/heads/MOP3/_images/tower.png
Binary file added refs/heads/MOP3/_images/tower2.png
Binary file added refs/heads/MOP3/_images/tower_eli1.png
51 changes: 51 additions & 0 deletions refs/heads/MOP3/_sources/about.rst.txt
@@ -0,0 +1,51 @@
.. _home-page-about:

***********************
About Master of Pores 3
***********************

.. autosummary::
   :toctree: generated


.. |docker| image:: https://img.shields.io/badge/Docker-v20.10.8-blue
.. |status| image:: https://github.com/biocorecrg/master_of_pores/actions/workflows/build.yml/badge.svg
.. |license| image:: https://img.shields.io/badge/License-MIT-yellow.svg
.. |nver| image:: https://img.shields.io/badge/Nextflow-21.04.1-brightgreen
.. |sing| image:: https://img.shields.io/badge/Singularity-v3.2.1-green.svg

.. list-table::
   :widths: 10 10 10 10 10
   :header-rows: 0

   * - |docker|
     - |status|
     - |license|
     - |nver|
     - |sing|

`Master of Pores 3 <https://github.com/biocorecrg/master_of_pores>`_ is a collection of pipelines written in Nextflow DSL2 for the analysis of Nanopore data. It can handle reads from direct RNAseq, cDNAseq, DNAseq etc.

The software is composed of four pipelines:

- mop_preprocess: preprocessing of input data. Basecalling, demultiplexing, alignment, read counts, and more!
- mop_mod: detecting chemical modifications. It reads the output directly from mop_preprocess
- mop_tail: estimating polyA tail size. It reads the output directly from mop_preprocess
- mop_consensus: it generates a consensus from the predictions from mop_mod. It reads the output directly from mop_mod

The name is inspired by Metallica's `Master Of Puppets <https://www.youtube.com/watch?v=S7blkui3nQc>`_

.. image:: ../img/goku3.png
   :width: 600

This is a joint project between `CRG bioinformatics core <https://biocore.crg.eu/>`_ and `Epitranscriptomics and RNA Dynamics research group <https://public-docs.crg.es/enovoa/public/website/index.html>`_.


Reference
======================

If you use this tool, please cite our papers:

`"Nanopore Direct RNA Sequencing Data Processing and Analysis Using MasterOfPores" <https://link.springer.com/protocol/10.1007/978-1-0716-2962-8_13>`__ Cozzuto L, Delgado-Tejedor A, Hermoso Pulido T, Novoa EM, Ponomarenko J. Methods Mol Biol. 2023;2624:185-205. doi: 10.1007/978-1-0716-2962-8_13.

`"MasterOfPores: A Workflow for the Analysis of Oxford Nanopore Direct RNA Sequencing Datasets" <https://doi.org/10.3389/fgene.2020.00211>`__ Luca Cozzuto, Huanle Liu, Leszek P. Pryszcz, Toni Hermoso Pulido, Anna Delgado-Tejedor, Julia Ponomarenko, Eva Maria Novoa. Front. Genet., 17 March 2020.
35 changes: 35 additions & 0 deletions refs/heads/MOP3/_sources/benchmark.rst.txt
@@ -0,0 +1,35 @@
*******************
Benchmark
*******************

We tested MoP on two MinION runs using the CRG's HPC cluster, where we can run up to 100 jobs in parallel (maximum 8 CPUs each) and use up to 10 GPU cards (GeForce RTX 2080 Ti). The test dataset is available at `ENA <https://www.ebi.ac.uk/>`_ under the accessions `ERR5296640 <https://www.ebi.ac.uk/ena/browser/view/ERR5296640>`__ (pU samples) and `ERR5303454 <https://www.ebi.ac.uk/ena/browser/view/ERR5303454>`__ (Nm samples).



.. list-table:: Dataset

   * -
     - MOP_PREPROCESS
     - MOP_MOD
     - MOP_TAIL
     - MOP_CONSENSUS
   * - Input data
     - 95 Gb
     - 137 Gb
     - 137 Gb
     - 14 Mb
   * - Execution time
     - 10 hours
     - 6 hours
     - 2.5 hours
     - 3 mins
   * - Work folder
     - 382 Gb
     - 595 Gb
     - 3 Gb
     - 25 Mb
   * - Output folder
     - 137 Gb
     - 14 Mb
     - 76 Mb
     - 13 Mb

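
As a rough guide for planning disk space, the work and output folder sizes in the table can be summed. The sketch below is only an illustration: it assumes that the folders of all four pipelines are kept on disk at the same time and that 1 Gb corresponds to 1000 Mb. Neither assumption comes from the benchmark itself, and the variable names are hypothetical.

.. code-block:: bash

   # Work folder + output folder per pipeline, taken from the table above
   # and converted to Mb (assumption: 1 Gb = 1000 Mb).
   preprocess_size=$((382000 + 137000))
   mod_size=$((595000 + 14))
   tail_size=$((3000 + 76))
   consensus_size=$((25 + 13))

   # Sum over the four pipelines.
   total=$((preprocess_size + mod_size + tail_size + consensus_size))
   echo "Total footprint: ${total} Mb (about $((total / 1000)) Gb)"

With these figures the total is roughly 1.1 Tb, most of it taken by the work folders of MOP_PREPROCESS and MOP_MOD.
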
74 changes: 74 additions & 0 deletions refs/heads/MOP3/_sources/changelog.rst.txt
@@ -0,0 +1,74 @@
.. _home-page-changelog:

**************
CHANGELOG
**************

.. autosummary::
   :toctree: generated

Version 3.0
================

* mop_preprocess

  * We added a custom model for m6A basecalling. It is automatically installed when running INSTALL.sh. To use it, you need to indicate ``--pars_tools "drna_tool_splice_m6A_opt.tsv"``.
  * We added support for CUDA 11 for guppy versions > 4.4.1.
  * Added readucks for improving demultiplexing with guppy (optional).
  * New parameter "barcodes" where you can specify a file with barcodes to be kept. Example in **keep_barcodes.txt**
  * Added a `new model for direct RNA basecalling <https://www.biorxiv.org/content/10.1101/2023.11.28.568965v1>`__.
  * Added support for dorado basecalling. Demultiplexing with dorado is not yet supported.
  * Guppy versions >= 6.5.x are also supported. There is no need to indicate different command lines for different guppy versions inside tool_opts: the pipeline detects the version and acts accordingly.
  * pod5 files are supported for dorado and guppy >= 6.5.x. In this case no fast5 and stats files are output, which limits what the other pipelines can do.

* mop_tail

  * We upgraded tailfindR to version 1.3.
  * tailfindR can be used either in standard mode or in nano3p mode (chemistry R10 and R9) by setting *tailfindr_mode* to standard, n3ps_r9 or n3ps_r10.

Version 2.0
================

* Completely rewritten using the powerful `DSL2 <https://www.nextflow.io/docs/latest/dsl2.html>`__.
* Subworkflows are stored in the independent repository `BioNextflow <https://github.com/biocorecrg/BioNextflow>`__.
* Global nextflow config is broken down into different profiles (cluster, cloud, local...)
* Added the new module **mop_consensus**
* mop_preprocess (formerly known as nanoPreprocess + nanoPreprocessSimple)

  * now can read multiple runs at a time using the syntax **"PATH/\*\*/\*.fast5"**
  * can demultiplex fast5 using guppy too
  * deeplexicon can be run on GPU too
  * Parameters of each tool are stored in a tsv file. We have different ones already pre-set for cDNA, DNA and dRNA (option **--pars_tools**)
  * Added new process **discovery** with bambu / isoquant for discovering and quantifying new transcripts.
  * demultiplexing, filtering, mapping, counting and discovery can be switched off by setting "NO" as a parameter
  * saveSpace can be set to "YES" to reduce the amount of disk space required. **WARNING: this makes it impossible to resume!**
  * Merged old NanoPreprocess and NanoPreprocessSimple in **mop_preprocess**. Using fastq or fast5 as input switches between the two execution modes.
  * Htseq-count now accepts alignments generated by minimap2. https://github.com/htseq/htseq/issues/33
  * We can specify a **final_summary_\*.txt** file in the params.config file for extracting kit and flowcell info. If it is not present, we should specify that info or a custom model via extra parameters in one of the **\*_opt.tsv** files, or guppy will trigger an error.
  * This module can be run in AWS BATCH using the profile **awsbatch**
  * demultiplexing of fast5 with deeplexicon is now faster thanks to multithreading and parallelization

* mop_tail (formerly known as nanoTail)

  * now you can launch each analysis independently
  * Fine tuning of the parameters for each step in tools_opt.tsv

* mop_mod (formerly known as nanoMod)

  * coming SOON!

Version 1.1
=================

* Added a new module called NanoPreprocessSimple that starts from fastq files instead of fast5 files. It allows the analysis of multiple files at a time.
* Added support for vbz compressed fast5 https://github.com/nanoporetech/vbz_compression in NanoPreprocess, NanoMod and NanoTail
* NanoPreprocess now also outputs CRAM files and can do downsampling with the parameter --downsampling
* NanoPreprocess allows performing variant calling using medaka (BETA)
* NanoPreprocess allows performing demultiplexing with GUPPY
* Added plots for Epinano output in NanoMod
* Added a conversion of Tombo results to bed format in NanoMod
* Added an INSTALL.sh file that automatically retrieves guppy 3.4.5 from https://mirror.oxfordnanoportal.com/, places it in NanoPreprocess/bin and makes the required links
* Added profiles for being used locally and on the CRG SGE cluster

Version 1.0
================

This is the original version published in the paper `MasterOfPores: A Workflow for the Analysis of Oxford Nanopore Direct RNA Sequencing Datasets <https://www.frontiersin.org/articles/10.3389/fgene.2020.00211/full>`__
18 changes: 18 additions & 0 deletions refs/heads/MOP3/_sources/ci.rst.txt
@@ -0,0 +1,18 @@
.. _home-page-about:

**********************
Continuous integration
**********************

.. autosummary::
   :toctree: generated

The following pipelines are continuously checked using GitHub Actions:

* mop_preprocess
* mop_mod
* mop_tail

.. image:: https://github.com/biocorecrg/master_of_pores/actions/workflows/build.yml/badge.svg
   :target: https://github.com/biocorecrg/master_of_pores
   :alt: pipeline status
43 changes: 43 additions & 0 deletions refs/heads/MOP3/_sources/index.rst.txt
@@ -0,0 +1,43 @@
.. _home-page-index:

*************************************************
Welcome to the documentation of Master Of Pores 3
*************************************************


.. autosummary::
   :toctree: generated

.. image:: ../img/goku3.png
   :width: 600

Master of Pores is a pipeline written in Nextflow DSL2 for the analysis of Nanopore data. It can handle reads from direct RNAseq, cDNAseq, DNAseq etc.

The pipeline is composed of four modules:

- mop_preprocess: preprocessing
- mop_mod: detecting chemical modifications. It reads the output directly from mop_preprocess
- mop_tail: estimating polyA tail size. It reads the output directly from mop_preprocess
- mop_consensus: it generates a consensus from the predictions from mop_mod. It reads the output directly from mop_mod

.. MoP3 documentation master file, created by
   Luca Cozzuto.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Contents:

.. toctree::
   :maxdepth: 1

   about
   install
   mop_preprocess
   mop_mod
   mop_consensus
   mop_tail
   reporting
   awsbatch
   benchmark
   changelog
   ci
   troubleshooting
72 changes: 72 additions & 0 deletions refs/heads/MOP3/_sources/install.rst.txt
@@ -0,0 +1,72 @@
.. _home-page-install:

**************
Get Started
**************

.. autosummary::
   :toctree: generated

Please install `Nextflow <https://www.nextflow.io/>`_ and either `Singularity <https://sylabs.io/>`_ or `Docker <https://www.docker.com/>`_ first.

To install Nextflow you need a POSIX-compatible system (Linux, OS X, etc.). It requires Bash 3.2 (or later) and Java 11 (or later, up to 17). Windows is supported through WSL. To install Nextflow, just run:

.. code-block:: console

   curl -s https://get.nextflow.io | bash

To install the pipeline you need to download the repo:

.. code-block:: console

   git clone --depth 1 --recurse-submodules https://github.com/biocorecrg/master_of_pores.git

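
The prerequisites listed above (Java 11 to 17 and a container engine) can be verified with a short script. This is a minimal sketch and not part of Master of Pores: the helper names ``java_major`` and ``java_ok`` are hypothetical, and the accepted range is simply the one stated above.

.. code-block:: bash

   # Extract the major version from a Java version string, handling both
   # the legacy "1.8.0_292" scheme and the modern "17.0.2" scheme.
   java_major() {
       v="$1"
       case "$v" in
           1.*) v="${v#1.}" ;;
       esac
       # Drop everything from the first non-digit character onwards.
       echo "${v%%[!0-9]*}"
   }

   # Succeed only if the major version lies in the range 11 to 17.
   java_ok() {
       major="$(java_major "$1")"
       [ -n "$major" ] && [ "$major" -ge 11 ] && [ "$major" -le 17 ]
   }

   # Report on the local machine without aborting when a tool is missing.
   if command -v java >/dev/null 2>&1; then
       found="$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')"
       if java_ok "$found"; then
           echo "Java $found: OK"
       else
           echo "Java $found: outside the 11 to 17 range"
       fi
   else
       echo "Java: not found"
   fi

   for engine in singularity docker; do
       if command -v "$engine" >/dev/null 2>&1; then
           echo "$engine: found"
       else
           echo "$engine: not found"
       fi
   done

Only one of the two container engines needs to be present.
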
Installing Guppy
================

Run **INSTALL.sh** followed by the version of Guppy you want to download.

.. note::

   Please consider that the support of VBZ compression of fast5 started with version 3.4.X.


.. code-block:: console

   cd master_of_pores; bash INSTALL.sh 6.0.1

or, to install the default version (3.4.5):

.. code-block:: console

   cd master_of_pores; bash INSTALL.sh

Guppy custom models for RNA basecalling will be downloaded from our repository https://biocore.crg.eu/public/mop3_pub/models.tar and placed automatically within the right path inside the pipeline.

You can install different versions of Guppy but only one will be run during the pipeline execution. For switching among them you need to run INSTALL.sh with the version you prefer.

Testing
============

.. code-block:: console

   cd mop_preprocess
   nextflow run mop_preprocess.nf -params-file params.f5.yaml -with-singularity -bg -profile local > log

.. tip::

   You can replace ``-with-singularity`` with ``-with-docker`` if you want to use the docker engine.

Profiles
============
Some nextflow configuration files are stored within the folder **conf** and can be selected using different profiles. Currently, we have:

- ci: for continuous integration testing (low resources)
- local: for use on a laptop without GPU support
- m1mac: for running the containers in emulation on M1/M2/M3 Apple processors
- sge: for use in an HPC with Sun Grid Engine
- cluster or crg: for use in the custom HPC environment at CRG
- slurm: for use in an HPC with SLURM
- awsbatch: for use in Amazon AWS cloud infrastructure
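
A profile is selected with the Nextflow option ``-profile``, as in the test command above. The sketch below checks a profile name against the list above and prints the resulting command instead of running it. The helper name ``mop_cmd`` is hypothetical, and the parameter file and container flag are simply copied from the test command, so they may need adjusting for other profiles.

.. code-block:: bash

   # Print the mop_preprocess command for a given profile (dry run).
   # Returns a non-zero status for names that are not in the list above.
   mop_cmd() {
       profile="$1"
       case "$profile" in
           ci|local|m1mac|sge|cluster|crg|slurm|awsbatch) ;;
           *) echo "unknown profile: $profile" >&2; return 1 ;;
       esac
       echo "nextflow run mop_preprocess.nf -params-file params.f5.yaml -with-singularity -profile $profile"
   }

   # Example: show the command for a SLURM cluster.
   mop_cmd slurm
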

63 changes: 63 additions & 0 deletions refs/heads/MOP3/_sources/mop_consensus.rst.txt
@@ -0,0 +1,63 @@
.. _home-page-mopconsensus:

*******************
MOP_CONSENSUS
*******************

.. autosummary::
   :toctree: generated

This pipeline takes as input the output from MOP_MOD with all four workflows. It outputs the consensus of the different predictions by running the tool `Nanoconsensus <https://github.com/ADelgadoT/NanoConsensus>`__ in parallel on each transcript for each comparison.


Input Parameters
======================

The input parameters are stored in yaml files like the one represented here:

.. literalinclude:: ../mop_consensus/params.yaml
   :language: yaml


How to run the pipeline
=============================

Before launching the pipeline, the user should:

1. Decide which containers to use - either docker or singularity **[-with-docker / -with-singularity]**.
2. Fill in both **params.config** and **tools_opt.tsv** files.

To launch the pipeline, please use the following command:

.. code-block:: console

   nextflow run mop_consensus.nf -params-file params.yaml -with-singularity > log.txt

You can run the pipeline in the background by adding the Nextflow parameter **-bg**:

.. code-block:: console

   nextflow run mop_consensus.nf -params-file params.yaml -with-singularity -bg > log.txt

You can change the parameters either by editing the **params.config** file or by passing them on the command line:

.. code-block:: console

   nextflow run mop_consensus.nf -params-file params.yaml -with-singularity -bg --output test2 > log.txt

You can specify a different working directory for temporary files:

.. code-block:: console

   nextflow run mop_consensus.nf -params-file params.yaml -with-singularity -bg -w /path/working_directory > log.txt

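
The options shown above can be combined in a single command. As a minimal sketch, the hypothetical helper ``consensus_cmd`` below (not part of Master of Pores) assembles the command line from a container engine, an optional working directory and an optional output folder, and prints it instead of running it.

.. code-block:: bash

   # Assemble the mop_consensus command from the options described above.
   consensus_cmd() {
       engine="$1"     # container engine: singularity or docker
       workdir="$2"    # optional working directory, passed to -w
       outdir="$3"     # optional output folder, passed to --output

       cmd="nextflow run mop_consensus.nf -params-file params.yaml -with-$engine -bg"
       if [ -n "$workdir" ]; then
           cmd="$cmd -w $workdir"
       fi
       if [ -n "$outdir" ]; then
           cmd="$cmd --output $outdir"
       fi
       echo "$cmd"
   }

   # Dry run: print the command; append "> log.txt" when running it for real.
   consensus_cmd singularity /path/working_directory test2
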
Results
====================

Here is an example of a result:

.. image:: ../img/nanocons.png
   :width: 800