docs
briney committed Oct 19, 2024
1 parent 42b8ce7 commit b111618
Showing 2 changed files with 27 additions and 5 deletions.
24 changes: 23 additions & 1 deletion docs/source/index.rst
@@ -38,6 +38,25 @@ appropriate for each level of granularity.

|
input/output (``abutils.io``)
--------------------------------

To simplify data manipulation and facilitate the integration of ``abutils`` into
existing pipelines, ``abutils`` provides a set of functions for reading and writing
sequence data:

* :ref:`read <read-sequences>`: read sequences from a variety of file formats, including FASTA, FASTQ, AIRR-C, and others.

* :ref:`write <write-sequences>`: write sequences to a variety of file formats, including FASTA, FASTQ, AIRR-C, and others.

* :ref:`convert <convert-sequences>`: convert between ``Sequence`` or ``Pair`` objects and Pandas_ or Polars_ DataFrames.

* :ref:`paths <path>`: functions for working with file paths and directories.

All of the IO functions are accessible via ``abutils.io``.

|
tools (``abutils.tl``)
--------------------

@@ -47,7 +66,7 @@ into custom pipelines or for use when performing interactive analyses:

* :ref:`pairwise alignment <pairwise-alignment>`: local (Smith-Waterman), global (Needleman-Wunsch) and semi-global pairwise sequence alignment using parasail_.

* :ref:`multiple sequence alignment <multiple-sequence-alignment>` using MAFFT_ or MUSCLE_
* :ref:`multiple sequence alignment <multiple-sequence-alignment>` using MAFFT_, MUSCLE_, or FAMSA_

* :ref:`clustering <clustering>`: identity-based sequence clustering with VSEARCH_, CDHIT_, or MMseqs2_

@@ -97,12 +116,15 @@ multiprocessing jobs, creating and modifying color palettes, and others.
.. _parasail: https://github.com/jeffdaily/parasail-python
.. _MAFFT: https://mafft.cbrc.jp/alignment/software/
.. _MUSCLE: https://www.drive5.com/muscle/
.. _FAMSA: https://github.com/MikkelSchubert/FAMSA
.. _VSEARCH: https://github.com/torognes/vsearch
.. _CDHIT: http://weizhongli-lab.org/cd-hit/
.. _MMseqs2: https://github.com/soedinglab/MMseqs2
.. _FastTree: http://www.microbesonline.org/fasttree/
.. _IgPhyML: https://github.com/kbhoehn/IgPhyML
.. _baltic: https://github.com/evogytis/baltic
.. _Pandas: https://pandas.pydata.org/
.. _Polars: https://pola.rs/



8 changes: 4 additions & 4 deletions docs/source/tools/clustering.rst
@@ -10,10 +10,10 @@ clustering algorithm is desired.

``abutils.tl.cluster`` can accept a variety of inputs, including:

- a path to a FASTA file
- a FASTA-formatted string
- a list of ``abutils.Sequence`` objects
- a list of anything accepted by :class:`abutils.Sequence`
* a path to a FASTA file
* a FASTA-formatted string
* a list of ``abutils.Sequence`` objects
* a list of anything accepted by :class:`abutils.Sequence`

The ``threshold`` argument is the sequence identity threshold for clustering, and should be between 0.0 and 1.0.
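
As a rough illustration of what an identity threshold between 0.0 and 1.0 means, the sketch below compares two pre-aligned sequences of equal length. It is not how ``abutils.tl.cluster`` or the underlying clustering tools compute identity. The helper names ``fractional_identity``, ``validate_threshold``, and ``same_cluster`` are hypothetical and exist only for this example.

```python
def fractional_identity(seq_a: str, seq_b: str) -> float:
    """Fraction of matching positions between two pre-aligned sequences.

    Assumption: both sequences are already aligned to the same length.
    Real clustering tools align sequences and may define identity differently.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


def validate_threshold(threshold: float) -> float:
    """Return the threshold unchanged if it lies in the range 0.0 to 1.0."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError(f"threshold must be between 0.0 and 1.0, got {threshold}")
    return threshold


def same_cluster(seq_a: str, seq_b: str, threshold: float) -> bool:
    """True if the two sequences meet or exceed the identity threshold."""
    return fractional_identity(seq_a, seq_b) >= validate_threshold(threshold)


if __name__ == "__main__":
    a = "ACGTACGTAC"
    b = "ACGTACGTAA"  # differs from `a` at the final position only
    print(fractional_identity(a, b))
    print(same_cluster(a, b, threshold=0.9))
    print(same_cluster(a, b, threshold=0.95))
```

Under this simplified definition, a higher threshold demands that sequences be more similar before they are grouped together, so raising it generally yields more, smaller clusters.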
