Revised WMLCE + Open-CE documentation. #102

Merged: 3 commits, Apr 19, 2022
44 changes: 41 additions & 3 deletions software/applications/conda.rst
@@ -5,6 +5,9 @@ Conda

`Conda <https://docs.conda.io/>`__ is an open source package management system and environment management system that runs on Windows, macOS and Linux. Conda quickly installs, runs and updates packages and their dependencies.


.. _software-applications-conda-installing:

Installing Miniconda
~~~~~~~~~~~~~~~~~~~~

@@ -26,15 +29,15 @@ The simplest way to install Conda for use on Bede is through the `miniconda <htt
sha256sum Miniconda3-latest-Linux-ppc64le.sh

sh Miniconda3-latest-Linux-ppc64le.sh -b -p ./miniconda
source miniconda/bin/activate
source miniconda/etc/profile.d/conda.sh
conda update conda -y
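
The checksum comparison can also be scripted rather than checked by eye. The following is a minimal sketch, assuming GNU ``sha256sum`` is available; the function name ``verify_sha256`` is hypothetical, and the expected digest must be copied from the Miniconda hashes page:

.. code-block:: bash

    # Hypothetical helper: succeeds only if the SHA-256 digest of FILE equals EXPECTED.
    # Usage: verify_sha256 FILE EXPECTED
    verify_sha256() {
        # sha256sum prints "<digest>  <filename>"; keep only the digest
        actual=$(sha256sum "$1" | cut -d ' ' -f 1)
        [ "$actual" = "$2" ]
    }

For example, ``verify_sha256 Miniconda3-latest-Linux-ppc64le.sh <expected-hash> && sh Miniconda3-latest-Linux-ppc64le.sh -b -p ./miniconda`` would only run the installer when the digest matches (``<expected-hash>`` is a placeholder).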

In subsequent sessions, or in job scripts, you may need to re-source miniconda. Alternatively, you could add this to your bash environment, e.g.:

.. code-block:: bash

export CONDADIR=/nobackup/projects/<project>/$USER # Update this with your <project> code.
source $CONDADIR/miniconda/bin/activate
source $CONDADIR/miniconda/etc/profile.d/conda.sh
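
In job scripts it can help to fail early with a clear message when the Conda install is missing. A minimal sketch, assuming the ``CONDADIR`` layout shown above; the function name ``load_conda`` is hypothetical:

.. code-block:: bash

    # Hypothetical helper: source conda.sh from BASE/miniconda if it exists,
    # otherwise print an error and return a non-zero status.
    # Usage: load_conda BASE
    load_conda() {
        conda_sh="$1/miniconda/etc/profile.d/conda.sh"
        if [ -f "$conda_sh" ]; then
            . "$conda_sh"
        else
            echo "conda.sh not found at $conda_sh" >&2
            return 1
        fi
    }

A job script could then call ``load_conda "$CONDADIR" || exit 1`` before any ``conda activate`` command.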

Creating a new Conda Environment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -45,14 +48,28 @@ I.e. to create a new conda environment named `example`, with `python 3.9` you ca

.. code-block:: bash

conda create -y --name example python==3.9
conda create -y --name example python=3.9

Once created, the environment can be activated using ``conda activate``.

.. code-block:: bash

conda activate example

Alternatively, Conda environments can be created outside of the conda/miniconda install, using the ``-p`` / ``--prefix`` option of ``conda create``.

For example, if you have installed miniconda to your home directory, but wish to create a conda environment named ``example`` within the ``/project/<PROJECT>/$USER/`` directory, you can use:

.. code-block:: bash

conda create -y --prefix /project/<PROJECT>/$USER/example python=3.9

This can subsequently be loaded via:

.. code-block:: bash

conda activate /project/<PROJECT>/$USER/example

Listing and Activating existing Conda Environments
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -64,6 +81,27 @@ Existing conda environments can be listed via:

``conda activate`` can then be used to activate one of the listed environments.

Adding Conda Channels to an Environment
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The default conda channel does not contain every package, and may not contain the versions of packages you wish to use.

In this case, third-party conda channels can be added to conda environments to provide access to these packages, such as the :ref:`Open-CE <software-applications-open-ce>` Conda channel hosted by Oregon State University.

It is recommended to add channels to specific conda environments, rather than your global conda configuration.

I.e. to add the `OSU Open-CE Conda channel <https://osuosl.org/services/powerdev/opence/>`__ to the currently loaded conda environment:

.. code-block:: bash

conda config --env --prepend channels https://ftp.osuosl.org/pub/open-ce/current/

You may also wish to enable `strict channel priority <https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-channels.html#strict-channel-priority>`__, which speeds up conda operations and reduces package incompatibility, and which will be the default from Conda 5.0. Note that this may break old environment files.

.. code-block:: bash

conda config --env --set channel_priority strict

Installing Conda Packages
~~~~~~~~~~~~~~~~~~~~~~~~~

117 changes: 117 additions & 0 deletions software/applications/open-ce.rst
@@ -0,0 +1,117 @@
.. _software-applications-open-ce:

Open-CE
=======

The `Open Cognitive Environment (Open-CE) <https://osuosl.org/services/powerdev/opence/>`__ is a community driven software distribution for machine learning and deep learning frameworks.

Open-CE software is distributed via :ref:`Conda<software-applications-conda>`, with all included packages for a given Open-CE release being installable into the same conda environment.

Open-CE conda channels suitable for use on Bede's IBM Power architecture systems are hosted by `Oregon State University <https://osuosl.org/services/powerdev/opence/>`__ and `MIT <https://opence.mit.edu/>`__.

Open-CE is the successor to :ref:`IBM WMLCE <software-applications-wmlce>`, which was archived on 2020-11-10, with IBM WMLCE 1.7.0 being the final release.

Open-CE includes the following software packages, amongst others:

* :ref:`TensorFlow <software-applications-tensorflow>`
* :ref:`PyTorch <software-applications-pytorch>`
* `Horovod <https://horovod.ai/>`__
* `ONNX <https://onnx.ai/>`__

.. note::

Open-CE does not include all features from WMLCE, such as Large Model Support or Distributed Deep Learning (DDL).

Using Open-CE
-------------

Open-CE provides software packages via :ref:`Conda<software-applications-conda>`, which you must first :ref:`install<software-applications-conda-installing>`.
Conda installations of the packages provided by Open-CE can become quite large (multiple GBs), so you may wish to use a conda installation in ``/nobackup/projects/<project>`` or ``/projects/<project>`` as described in the :ref:`Installing Conda section <software-applications-conda-installing>`.

With a working Conda install, Open-CE packages can be installed from either the OSU or MIT Conda channels for PPC64LE systems such as Bede.

* OSU: ``https://ftp.osuosl.org/pub/open-ce/current/``
* MIT: ``https://opence.mit.edu/``

Using Conda environments is recommended when working with Open-CE.

For example, to install ``tensorflow`` and ``pytorch`` from the OSU Open-CE conda channel into a conda environment named ``open-ce``:

.. code-block:: bash

# Create a new conda environment named open-ce within your conda installation
conda create -y --name open-ce python=3.9 # Older Open-CE may require older Python versions

# Activate the conda environment
conda activate open-ce

# Add the OSU Open-CE conda channel to the current environment config
conda config --env --prepend channels https://ftp.osuosl.org/pub/open-ce/current/
# Also use strict channel priority
conda config --env --set channel_priority strict

# Install the required conda packages, using the channels set within the conda env. This may take some time.
conda install -y tensorflow
conda install -y pytorch

Once installed into a conda environment, the Open-CE provided software packages can be used interactively on login nodes or within batch jobs by activating the named conda environment.

.. code-block:: bash

# Activate the conda environment
conda activate open-ce

# Run a python command or script which makes use of the installed packages
# e.g. to output the version of tensorflow:
python3 -c "import tensorflow;print(tensorflow.__version__)"

# or to output the version of pytorch:
python3 -c "import torch;print(torch.__version__)"
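
In batch jobs it can be useful to confirm that the intended environment is active before running any Python code. A minimal sketch, relying on the ``CONDA_DEFAULT_ENV`` variable that ``conda activate`` sets; the function name ``require_conda_env`` is hypothetical:

.. code-block:: bash

    # Hypothetical helper: succeeds only if the active conda environment is NAME.
    # Usage: require_conda_env NAME
    require_conda_env() {
        if [ "${CONDA_DEFAULT_ENV:-}" != "$1" ]; then
            echo "expected conda environment '$1', found '${CONDA_DEFAULT_ENV:-none}'" >&2
            return 1
        fi
    }

For example, ``require_conda_env open-ce || exit 1`` placed after ``conda activate open-ce`` would stop the job if activation did not succeed. For environments created with ``--prefix``, this variable typically holds the full path rather than a name.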

Using older versions of Open-CE
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The OSU conda distribution provides an archive of older Open-CE releases, beginning at version ``1.0.0``.

The available versions are listed at https://ftp.osuosl.org/pub/open-ce/.

To use a version other than ``current``, replace ``current`` in the channel URI with the desired version number when adding the channel to the current conda environment.

I.e. to explicitly use Open-CE ``1.4.1`` the command to add the conda channel to the current environment would be:

.. code-block:: bash

conda config --env --prepend channels https://ftp.osuosl.org/pub/open-ce/1.4.1/

Using older Open-CE versions may require older python versions.
See the `OSU Open-CE page <https://osuosl.org/services/powerdev/opence/>`__ for further version information.
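
If several Open-CE releases are in use, the channel URI can be built from the version number. A minimal sketch, assuming the OSU URL layout shown above; the function name ``opence_channel_url`` is hypothetical:

.. code-block:: bash

    # Hypothetical helper: print the OSU Open-CE channel URL for a release.
    # With no argument it falls back to "current".
    # Usage: opence_channel_url [VERSION]
    opence_channel_url() {
        printf 'https://ftp.osuosl.org/pub/open-ce/%s/\n' "${1:-current}"
    }

    # Intended use within an activated environment (not run here):
    #   conda config --env --prepend channels "$(opence_channel_url 1.4.1)"

The helper only formats the URL; it does not check that the requested release exists in the OSU archive.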

The MIT Open-CE channel provides multiple versions of Open-CE in the same Conda channel. If using the MIT Open-CE distribution, older versions of packages can be requested by explicitly specifying the version of the desired package.

Why use Open-CE
---------------

Modern machine learning packages like TensorFlow and PyTorch have large dependency trees which can conflict with one another due to their independent release schedules.
This has made it difficult to use multiple competing packages within the same environment.

Open-CE solves this issue by ensuring that packages included in a given Open-CE distribution are compatible with one another, and can be installed at the same time, simplifying the distribution of these packages.

It also provides pre-compiled distributions of these packages for PPC64LE architecture machines, which are not always available from upstream sources, reducing the time required to install these packages.

For more information on the potential benefits of using Open-CE see `this blog post from the OpenPOWER foundation <https://openpowerfoundation.org/blog/open-cognitive-environment-open-ce-a-valuable-tool-for-ai-researchers/>`__.

Differences from WMLCE
----------------------

:ref:`IBM WMLCE<software-applications-wmlce>` includes several features not available in upstream TensorFlow and PyTorch distributions, such as Large Model Support (LMS).

Unfortunately, LMS is not available in the TensorFlow or PyTorch packages provided by Open-CE.

Features and packages which were included in WMLCE but are absent from Open-CE include:

* Large Model Support (LMS)
* IBM DDL
* Caffe (IBM-enhanced)
* IBM SnapML
* NVIDIA RAPIDS

63 changes: 38 additions & 25 deletions software/applications/pytorch.rst
@@ -6,44 +6,57 @@ PyTorch
`PyTorch <https://pytorch.org/>`__ is an end-to-end machine learning framework.
PyTorch enables fast, flexible experimentation and efficient production through a user-friendly front-end, distributed training, and ecosystem of tools and libraries.

The main method of distribution for PyTorch is via :ref:`Conda <software-applications-conda>`.
The main method of distribution for PyTorch is via :ref:`Conda <software-applications-conda>`, with :ref:`Open-CE<software-applications-open-ce>` providing a simple method for installing multiple machine learning frameworks into a single conda environment.

For more information on the usage of PyTorch, see the `Online Documentation <https://pytorch.org/docs/>`__.
The upstream Conda and pip distributions do not provide ppc64le pytorch packages at this time.

PyTorch Quickstart
~~~~~~~~~~~~~~~~~~
Installing via Conda
~~~~~~~~~~~~~~~~~~~~

With a working Conda installation (see :ref:`Installing Miniconda<software-applications-conda-installing>`) the following instructions can be used to create a Python 3.8 conda environment named ``torch-env`` with the latest Open-CE provided PyTorch:

.. note::

PyTorch installations via conda can be relatively large. Consider installing your miniconda (and therefore your conda environments) to the ``/nobackup`` file store.

The following should get you set up with a working conda environment (replacing <project> with your project code):

.. code-block:: bash

export DIR=/nobackup/projects/<project>/$USER
# rm -rf ~/.conda ~/.condarc $DIR/miniconda # Uncomment if you want to remove old env
mkdir $DIR
pushd $DIR
# Create a new conda environment named torch-env within your conda installation
conda create -y --name torch-env python=3.8

# Activate the conda environment
conda activate torch-env

# Add the OSU Open-CE conda channel to the current environment config
conda config --env --prepend channels https://ftp.osuosl.org/pub/open-ce/current/

# Download the latest miniconda installer for ppc64le
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-ppc64le.sh
# Validate that the file checksum matches the one listed at https://docs.conda.io/en/latest/miniconda_hashes.html
sha256sum Miniconda3-latest-Linux-ppc64le.sh
# Also use strict channel priority
conda config --env --set channel_priority strict

# Install the latest available version of PyTorch
conda install -y pytorch

In subsequent interactive sessions, and when submitting batch jobs which use PyTorch, you will then need to re-activate the conda environment.

For example, to verify that PyTorch is available and print the version:

.. code-block:: bash

sh Miniconda3-latest-Linux-ppc64le.sh -b -p $DIR/miniconda
source miniconda/bin/activate
conda update conda -y
conda config --set channel_priority strict
# Activate the conda environment
conda activate torch-env

conda config --prepend channels \
https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/
# Invoke python
python3 -c "import torch;print(torch.__version__)"

conda config --prepend channels \
https://opence.mit.edu

conda create --name opence pytorch=1.7.1 -y
conda activate opence
Installation via the upstream Conda channel is not currently possible, due to the lack of ``ppc64le`` or ``noarch`` distributions.


This has some limitations such as not supporting large model support.
If you require LMS, please see the :ref:`WMLCE <software-applications-wmlce>` page.
.. note::

The :ref:`Open-CE<software-applications-open-ce>` distribution of PyTorch does not include IBM technologies such as DDL or LMS, which were previously available via :ref:`WMLCE<software-applications-wmlce>`.
WMLCE is not supported on RHEL 8.


Further Information
61 changes: 38 additions & 23 deletions software/applications/tensorflow.rst
@@ -1,43 +1,58 @@
.. _software-python-tensorflow:
.. _software-applications-tensorflow:

TensorFlow
----------

`TensorFlow <https://www.tensorflow.org/>`__ is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML powered applications.

TensorFlow Quickstart
~~~~~~~~~~~~~~~~~~~~~
TensorFlow can be installed through a number of python package managers such as :ref:`Conda<software-applications-conda>` or ``pip``.

For use on Bede, the simplest method is to install TensorFlow using the :ref:`Open-CE Conda distribution<software-applications-open-ce>`.


Installing via Conda (Open-CE)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

With a working Conda installation (see :ref:`Installing Miniconda<software-applications-conda-installing>`) the following instructions can be used to create a Python 3.8 conda environment named ``tf-env`` with the latest Open-CE provided TensorFlow:

.. note::

TensorFlow installations via conda can be relatively large. Consider installing your miniconda (and therefore your conda environments) to the ``/nobackup`` file store.

The following should get you set up with a working conda environment (replacing ``<project>`` with your project code):

.. code-block:: bash

export DIR=/nobackup/projects/<project>/$USER
# rm -rf ~/.conda ~/.condarc $DIR/miniconda # Uncomment if you want to remove old env
mkdir $DIR
pushd $DIR
# Create a new conda environment named tf-env within your conda installation
conda create -y --name tf-env python=3.8

# Download the latest miniconda installer for ppc64le
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-ppc64le.sh
# Validate that the file checksum matches the one listed at https://docs.conda.io/en/latest/miniconda_hashes.html
sha256sum Miniconda3-latest-Linux-ppc64le.sh
# Activate the conda environment
conda activate tf-env

sh Miniconda3-latest-Linux-ppc64le.sh -b -p $DIR/miniconda
source miniconda/bin/activate
conda update conda -y
# Add the OSU Open-CE conda channel to the current environment config
conda config --env --prepend channels https://ftp.osuosl.org/pub/open-ce/current/

conda config --prepend channels \
https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda/
# Also use strict channel priority
conda config --env --set channel_priority strict

# Install the latest available version of TensorFlow
conda install -y tensorflow

In subsequent interactive sessions, and when submitting batch jobs which use TensorFlow, you will then need to re-activate the conda environment.

For example, to verify that TensorFlow is available and print the version:

.. code-block:: bash

conda config --prepend channels \
https://opence.mit.edu
# Activate the conda environment
conda activate tf-env

conda create --name opence tensorflow -y
conda activate opence
# Invoke python
python3 -c "import tensorflow;print(tensorflow.__version__)"

.. note::

This conflicts with the :ref:`PyTorch <software-applications-pytorch>` instructions, as they set the conda ``channel_priority`` to strict, which seems to cause issues when installing TensorFlow.

The :ref:`Open-CE<software-applications-open-ce>` distribution of TensorFlow does not include IBM technologies such as DDL or LMS, which were previously available via :ref:`WMLCE<software-applications-wmlce>`.
WMLCE is not supported on RHEL 8.

Further Information
~~~~~~~~~~~~~~~~~~~