Rework the contributor guide
padix-key committed Apr 30, 2024
1 parent 30a6bb4 commit 48b41ed
Showing 9 changed files with 579 additions and 433 deletions.
430 changes: 0 additions & 430 deletions doc/contribute.rst

This file was deleted.

79 changes: 79 additions & 0 deletions doc/contribution/deployment.rst
@@ -0,0 +1,79 @@
Deployment of a new release
===========================
This section describes how to create and deploy a release build of the *Biotite*
package and documentation.
Therefore, this section primarily addresses the maintainers of the project.

CCD update
----------
:mod:`biotite.structure.info` bundles selected information from the
`Chemical Component Dictionary <https://www.wwpdb.org/data/ccd>`_ (CCD).
From time to time, this dataset needs an update to include new components
added to the CCD.
This is achieved by running ``setup_ccd.py``.

To keep the size of the repository small, the original commit from the initial
script run should be rewritten, if the formats of the affected files are
compatible with the original ones.

Version bump
------------
A version bump requires changes in multiple locations:

- ``src/biotite/__init__.py``: The main source of the version number.
- ``doc/static/switcher.json``: The current version needs to be added and set
  as the preferred one. This allows the documentation website to switch between
  different versions of the documentation.

The version bump is conducted by running the ``bump_version.yml`` CI job.
It can be triggered via the GitHub Action ``Bump version``.
This action creates a new pull request with the required changes.
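
The change to ``switcher.json`` described above can be sketched as follows.
This is only an illustration, not the code run by the CI job: the entry keys
(``name``, ``version``, ``url``, ``preferred``) are assumed to follow the
version switcher format of the documentation theme, and the function name and
the ``example.org`` URLs are hypothetical.

```python
import json


def add_preferred_version(entries, version, url):
    """Return a new switcher list with ``version`` added as preferred."""
    # Copy all other entries and drop their 'preferred' flag,
    # so that only the new version is marked as preferred
    updated = [
        {key: value for key, value in entry.items() if key != "preferred"}
        for entry in entries
        if entry.get("version") != version
    ]
    updated.append(
        {"name": version, "version": version, "url": url, "preferred": True}
    )
    return updated


# Hypothetical content of the file before the version bump
switcher = [
    {
        "name": "a.b.c",
        "version": "a.b.c",
        "url": "https://example.org/a.b.c/",
        "preferred": True,
    }
]
switcher = add_preferred_version(
    switcher, "x.y.z", "https://example.org/x.y.z/"
)
print(json.dumps(switcher, indent=4))
```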

Creating a new release
----------------------
When a new *GitHub* release is created, the CI jobs building the distributions
and documentation in ``test_and_deploy.yml`` are triggered.
After the successful completion of these jobs, the artifacts are added to the
release.
The distributions for different platforms and Python versions are automatically
uploaded to *PyPI*.

Conda release
-------------
Some time after the release on GitHub, the ``conda-forge`` bot will also create
an automatic pull request for the new release of the
`Conda package <https://github.com/conda-forge/biotite-feedstock>`_.
If no dependencies changed, this pull request can usually be merged without
further effort.

Documentation website
---------------------
The final step of the deployment is putting the directory containing the built
documentation onto the server hosting the website.

The document root of the website should look like this:

.. code-block::

    ├─ .htaccess
    ├─ latest -> x.y.z/
    ├─ x.y.z/
    │  ├─ index.html
    │  ├─ ...
    ├─ a.b.c/
       ├─ index.html
       ├─ ...

``x.y.z/`` and ``a.b.c/`` represent the documentation directories for two
different *Biotite* release versions.

``.htaccess`` should have the following content:

.. code-block:: apache

    RewriteBase /
    RewriteEngine On
    # Redirect if page name does not start with 'latest' or version identifier
    RewriteRule ^(?!latest|\d+\.\d+\.\d+|robots.txt)(.*) latest/$1 [R=301,L]
    ErrorDocument 404 /latest/404.html

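
To check which paths the rewrite rule above redirects, the pattern can be
tried out in Python.
This is a rough sketch: Python's ``re`` module only approximates the regular
expression engine used by the web server, the paths are assumed to be given
relative to the document root (without leading slash), and
``redirect_target()`` is a hypothetical helper.

```python
import re

# The pattern copied from the 'RewriteRule' line
PATTERN = re.compile(r"^(?!latest|\d+\.\d+\.\d+|robots.txt)(.*)")


def redirect_target(path):
    """Return the redirect target for ``path`` or ``None``, if it is kept."""
    match = PATTERN.match(path)
    if match is None:
        # Path starts with 'latest', a version identifier or 'robots.txt'
        return None
    return "latest/" + match.group(1)


for path in ["index.html", "latest/index.html", "1.2.3/index.html"]:
    print(path, "->", redirect_target(path))
```

Only the first of the three paths is redirected, as the other two already
start with ``latest`` or a version identifier.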
187 changes: 187 additions & 0 deletions doc/contribution/development.rst
@@ -0,0 +1,187 @@
Writing source code
===================

Scope
-----
The scope of *Biotite* comprises methods that make up the backbone of
computational molecular biology. Thus, new functionalities added to
*Biotite* should be relatively general and well established.

Code whose purpose is too specialized can be published as an
:ref:`extension package <extension_packages>` instead.

Consistency
-----------
To keep the code as uniform as possible, new functionalities should act on the
existing central classes where applicable.
Specifically, these include

- :class:`biotite.structure.AtomArray`,
- :class:`biotite.structure.AtomArrayStack`,
- :class:`biotite.structure.BondList`,
- :class:`biotite.sequence.Sequence` and its subclasses,
- :class:`biotite.sequence.Alphabet`,
- :class:`biotite.sequence.Annotation`,
  including :class:`biotite.sequence.Feature`
  and :class:`biotite.sequence.Location`,
- :class:`biotite.sequence.AnnotatedSequence`,
- :class:`biotite.sequence.Profile`,
- :class:`biotite.sequence.align.Alignment`,
- :class:`biotite.application.Application` and its subclasses,
- and in general :class:`numpy.ndarray`.

If you think that the currently available classes miss a central *object*
in bioinformatics, consider opening an issue on *GitHub* or reaching
out to the maintainers.

Small *helper classes* for a functionality (for example an :class:`Enum` for a
function parameter) are also permitted, as long as they do not introduce a
redundancy with the classes mentioned above.

Python version and interpreter
------------------------------
The package supports the three most recent versions of Python.
Consequently, language features that were introduced after the oldest
supported Python version must not be used.

This time span balances support for older Python versions against
the ability to use more recent features of the programming language.
Furthermore, this package is currently made for usage with CPython.
Official support for PyPy might be added someday.

Code style
----------
*Biotite* is in compliance with PEP 8.
The maximum line length is 79 for code lines and 72 for docstring and
comment lines.
An exception is made for docstring lines, if it is not possible to use a
maximum of 72 characters (e.g. tables), and for
`doctest <https://docs.python.org/3/library/doctest.html>`_ lines,
where the actual code may take up to 79 characters.

Dependencies
------------
*Biotite* aims to rely only on a few dependencies to keep the installation
small.
However, optional dependencies for a specific functionality are also allowed if
necessary.
In this case add your special dependency to the list of extra
requirements in ``install.rst``.
The import statement for the dependency should be located directly inside the
function or class, rather than module level, to ensure that the package is not
required for any other functionality or for building the API documentation.

An example of this approach is the support for trajectory files in
:mod:`biotite.structure.io`, which requires `MDTraj <http://mdtraj.org/>`_.
The usage of these packages is not only allowed but even encouraged.
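
A minimal sketch of such a function-level import is shown below.
Both ``require_optional()`` and ``load_trajectory()`` are hypothetical names
chosen for illustration and are not part of *Biotite*; only
``mdtraj.load()`` is an actual function of the optional package.

```python
import importlib


def require_optional(module_name, feature):
    """Import an optional dependency or fail with an informative message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as error:
        raise ImportError(
            f"'{module_name}' is required for {feature}, "
            f"but it is not installed"
        ) from error


def load_trajectory(file_name, topology_file):
    """Load a trajectory file (hypothetical example function)."""
    # The import happens inside the function and not at module level:
    # Importing the surrounding module works without the package
    mdtraj = require_optional("mdtraj", "reading trajectory files")
    return mdtraj.load(file_name, top=topology_file)
```

With this layout a missing package only causes an error when the affected
function is actually called.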

Code efficiency
---------------
The central aims of *Biotite* are to be both convenient and fast.
Therefore, the code should be vectorized as much as possible using *NumPy*.
In cases where the problem cannot be reasonably or conveniently solved this
way, writing modules in `Cython <https://cython.readthedocs.io/en/latest/>`_
is the preferred way to go.
Writing extensions directly in C/C++ is discouraged due to poor readability.
Writing extensions in other programming languages
(e.g. in *Rust* via `PyO3 <https://pyo3.rs>`_) is currently not permitted to
keep the build process simple.

Docstrings
----------
*Biotite* uses
`numpydoc <https://numpydoc.readthedocs.io/en/latest/format.html>`_
formatted docstrings for its documentation.
These docstrings can be interpreted by *Sphinx* via the ``numpydoc`` extension.
All publicly accessible attributes must be fully documented.
This includes functions, classes, methods, instance and class variables and the
``__init__`` modules:

The ``__init__`` module documentation summarizes the content of the entire
subpackage, since the single modules are not visible to the user.
In the class docstring, the class itself is described and the constructor is
documented.
The publicly accessible instance variables are documented under the
`Attributes` headline, while class variables are documented in their separate
docstrings.
Methods do not need to be summarized in the class docstring.
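
The following sketch shows how a class docstring following these rules might
look.
``Interval`` is a made-up example class and not part of *Biotite*; the section
names are taken from the *numpydoc* format.

```python
class Interval:
    """
    A closed interval on the integer number line.

    Parameters
    ----------
    first, last : int
        The first and last position of the interval (inclusive).

    Attributes
    ----------
    first, last : int
        The first and last position of the interval.

    Examples
    --------
    >>> interval = Interval(3, 7)
    >>> print(interval.length())
    5
    """

    def __init__(self, first, last):
        # The constructor is documented in the class docstring
        self.first = first
        self.last = last

    def length(self):
        """
        Get the number of positions covered by the interval.

        Returns
        -------
        length : int
            The number of positions, including both ends.
        """
        return self.last - self.first + 1
```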

Module imports
--------------
In *Biotite*, the user imports packages rather than single modules
(similar to *NumPy*).
In order for that to work, the ``__init__.py`` file of each *Biotite*
subpackage needs to import all of its modules whose content is publicly
accessible, using relative imports.

.. code-block:: python

    from .module1 import *
    from .module2 import *

Import statements should be the only statements in a ``__init__.py`` file.

In case a module needs functionality from another subpackage of *Biotite*,
use a relative import.
This import should target the module directly and not the package to avoid
circular imports and thus an ``ImportError``.
So import statements like the following are totally OK:

.. code-block:: python

    from ...package.subpackage.module import foo

In order to prevent namespace pollution, all modules must define the `__all__`
variable with all publicly accessible attributes of the module.
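
The effect of this variable on a ``*`` import can be demonstrated with a small
self-contained sketch.
The module is built in memory here only to keep the example runnable as a
single file; the module and function names are hypothetical.

```python
import sys
import types

# Source code of a hypothetical module that defines '__all__'
MODULE_SOURCE = '''
__all__ = ["public_function"]

def public_function():
    return _helper() + 1

def _helper():
    return 41

def unlisted_function():
    return 0
'''

# Create the module in memory and register it, so it can be imported
module = types.ModuleType("all_demo_module")
exec(MODULE_SOURCE, module.__dict__)
sys.modules["all_demo_module"] = module

# Perform the '*' import into a fresh namespace
namespace = {}
exec("from all_demo_module import *", namespace)
imported = sorted(name for name in namespace if not name.startswith("__"))
print(imported)
```

Only the name listed in ``__all__`` ends up in the importing namespace,
although ``unlisted_function()`` does not start with an underscore.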

Versioning
----------
Biotite adopts `Semantic Versioning <https://semver.org>`_ for its releases.
This means that the version number is composed of three parts:

- Major version: Incremented when incompatible API changes are made.
- Minor version: Incremented when a new functionality is added in a backwards
  compatible manner.
- Patch version: Incremented when backwards compatible bug fixes are made.
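
As a small illustration of these rules, the following sketch computes the next
version number for each kind of change.
``bump_version()`` is a hypothetical helper written for this guide and is
unrelated to the ``bump_version.yml`` CI job.

```python
def bump_version(version, part):
    """Increment the given part of a 'major.minor.patch' version string."""
    major, minor, patch = (int(number) for number in version.split("."))
    if part == "major":
        # Incompatible API change: reset minor and patch version
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backwards compatible new functionality: reset patch version
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backwards compatible bug fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown version part '{part}'")


print(bump_version("1.2.3", "major"))
print(bump_version("1.2.3", "minor"))
print(bump_version("1.2.3", "patch"))
```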

Note that backwards incompatible changes in minor/patch versions are only
disallowed with regard to the *public API*.
This means that names and types of parameters and the type of the return value
must not be changed in any function/class documented in the API reference.
However, behavioral changes (especially small ones) are allowed.

Although minor versions may not remove existing functionalities, they can
deprecate them by

- marking them as deprecated via a notice in the docstring and
- raising a `DeprecationWarning` when a deprecated functionality is used.

This gives the user a heads-up that the functionality will be removed soon.
In the next major version, deprecated functionalities can be removed entirely.
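
A deprecation following these two steps could look like the sketch below.
``old_function()`` and ``new_function()`` are placeholder names, and the exact
wording of the notice is an assumption.

```python
import warnings


def new_function(x):
    """
    Double the input value.
    """
    return x * 2


def old_function(x):
    """
    Double the input value.

    DEPRECATED: Use :func:`new_function()` instead.
    """
    # 'stacklevel=2' attributes the warning to the calling code
    warnings.warn(
        "'old_function()' is deprecated, use 'new_function()' instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_function(x)
```

The deprecated function keeps working until its removal, so existing user code
does not break within the same major version.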

.. _extension_packages:

Extension packages
------------------
*Biotite* extension packages are Python packages that provide further
functionality for *Biotite* objects (:class:`AtomArray`, :class:`Sequence`,
etc.) or offer objects that build upon them.

There can be good reasons to publish code as an extension
package instead of contributing it directly to the *Biotite* project:

- Independent development
- An incompatible license
- The code's use cases are too specialized
- Unsuitable dependencies
- Extensions written in a non-permitted programming language

If your code fulfills the following conditions

- extends *Biotite* functionality
- is documented
- is well tested

you can open an issue to ask for addition of the package to the
:doc:`extension package page <../extensions>`.