Improved quickstart
Andreas Hellander committed Jul 4, 2024
1 parent 96da4ee commit fa4b928
Showing 4 changed files with 80 additions and 45 deletions.
2 changes: 2 additions & 0 deletions docs/architecture.rst
@@ -1,3 +1,5 @@
.. _architecture-label:

Architecture overview
=====================

4 changes: 2 additions & 2 deletions docs/index.rst
@@ -3,13 +3,13 @@
:caption: Introduction

introduction
quickstart
projects

.. toctree::
:maxdepth: 1
:caption: Documentation

quickstart
projects
studio
apiclient
architecture
22 changes: 13 additions & 9 deletions docs/projects.rst
@@ -1,8 +1,11 @@
.. _projects-label:

Creating your own FEDn Projects
Building your own projects
================================================

This guide explains how a FEDn project is structured, and details how to develop your own
projects for your own use-cases.

A FEDn project is a convention for packaging/wrapping machine learning code to be used for federated learning with FEDn. At the core,
a project is a directory of files (often a Git repository), containing your machine learning code, FEDn entry points, and a specification
of the runtime environment (Python environment or a Docker image). The FEDn API and command-line tools provide functionality
@@ -28,9 +31,9 @@ We recommend that projects have roughly the following folder and file structure:
| └ Dockerfile / docker-compose.yaml
|
The "client" folder is referred to as the *compute package*. The file fedn.yaml is the FEDn Project File. It informs the FEDn Client of the code entry points to execute when computing model updates (local training) and, optionally, validating models.
When deploying the project to FEDn, the client folder will be compressed as a .tgz bundle and uploaded to the FEDn controller. FEDn can then manage the distribution of the compute package to each client/data provider when they connect.
Upon receipt of the bundle, the client will unpack it and stage it locally.
The ``client`` folder is commonly referred to as the *compute package*. The file ``fedn.yaml`` is the FEDn Project File. It contains information about the ``entry points``. The entry points are used by the client to compute model updates (local training) and local validations (optional).
To run a project in FEDn, the client folder is compressed as a .tgz bundle and pushed to the FEDn controller. FEDn then manages the distribution of the compute package to each client.
Upon receipt of the package, a client will unpack it and stage it locally.
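
To make the packaging step concrete, the following is a minimal sketch of how a folder can be compressed into a ``.tgz`` bundle using only the Python standard library. It is not FEDn's own implementation (in practice you would use ``fedn package create``); the function name ``bundle_compute_package`` and the choice to store paths relative to the client folder are assumptions made for illustration.

.. code-block:: python

    import tarfile
    from pathlib import Path


    def bundle_compute_package(client_dir: str, output: str = "package.tgz") -> list[str]:
        """Compress the files under ``client_dir`` into a gzip-compressed tarball.

        Hypothetical helper for illustration only; returns the sorted member names.
        """
        root = Path(client_dir)
        if not root.is_dir():
            raise NotADirectoryError(f"{client_dir} is not a directory")

        with tarfile.open(output, "w:gz") as archive:
            for path in sorted(root.rglob("*")):
                if path.is_file():
                    # Assumption: member paths are stored relative to the client folder.
                    archive.add(path, arcname=path.relative_to(root).as_posix())

        # Re-open the archive to confirm what was actually written.
        with tarfile.open(output, "r:gz") as archive:
            return sorted(archive.getnames())

Calling ``bundle_compute_package("client")`` from the project root would write ``package.tgz`` next to the ``client`` folder and return the names of the bundled files.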

.. image:: img/ComputePackageOverview.png
:alt: Compute package overview
@@ -62,11 +65,12 @@ what environment to execute those entrypoints in.
Environment
^^^^^^^^^^^

The software environment to be used to execute the entry points. This should specify all client-side dependencies of the project.
FEDn currently supports Virtualenv environments, with packages on PyPI. When a project specifies a **python_env**, the FEDn
client will create an isolated virtual environment and install the project dependencies into it before starting up the client.

It is assumed that all entry points are executable within the client runtime environment. As a user, you have two main options
to specify the environment:

1. Provide a ``python_env`` in the ``fedn.yaml`` file. In this case, FEDn will create an isolated virtual environment and install the project dependencies into it before starting up the client. FEDn currently supports Virtualenv environments, with packages on PyPI.
2. Manage the environment manually. Here you have several options, such as managing your own virtualenv, running in a Docker container, etc. Remove the ``python_env`` tag from ``fedn.yaml`` to handle the environment manually.

Entry Points
^^^^^^^^^^^^
@@ -75,7 +79,7 @@ There are up to four Entry Points to be specified.

**Build Entrypoint (build, optional):**

This entrypoint is usually called **once** for building artifacts such as initial seed models. However, it is not limited to artifacts, and can be used for any kind of setup that needs to be done before the client starts up.
This entrypoint is intended to be called **once** for building artifacts such as initial seed models. However, it is not limited to artifacts, and can be used for any kind of setup that needs to be done before the client starts up.

**Startup Entrypoint (startup, optional):**

97 changes: 63 additions & 34 deletions docs/quickstart.rst
@@ -2,7 +2,7 @@ Getting started with FEDn
=========================

.. note::
This tutorial is a quickstart guide to FEDn based on a pre-made FEDn Project. It is designed to serve as a minimalistic starting point for developers.
This tutorial is a quickstart guide to FEDn based on a pre-made FEDn Project. It is designed to serve as a starting point for new developers.
To learn about FEDn Projects in order to develop your own federated machine learning projects, see :ref:`projects-label`.

**Prerequisites**
@@ -11,7 +11,7 @@ Getting started with FEDn
- `A FEDn Studio account <https://fedn.scaleoutsystems.com/signup>`__


Set up a FEDn Studio Project
Start a FEDn Studio Project
----------------------------

Start by creating an account in Studio. Head over to `fedn.scaleoutsystems.com/signup <https://fedn.scaleoutsystems.com/signup/>`_ and sign up.
@@ -23,10 +23,13 @@ Logged into Studio, do:
3. Enter the project name (mandatory). The project description is optional.
4. Click the "Create" button to create the project.

Your Studio project provides all server-side components. Next, you will set up your local machine / client and create a FEDn project.
.. image:: img/studio_project_overview.png

Install FEDn
------------
When these steps are complete, you will see a Studio project similar to the above image. The Studio project provides all server-side components of FEDn needed to manage
federated training. We will use this project at a later stage to run the federated experiments. But first, we will set up the local client.

Install FEDn on your client
----------------------------

**Using pip**

@@ -50,35 +53,45 @@ It is recommended to use a virtual environment when installing FEDn.

.. _package-creation:

Initialize FEDn with the client code bundle and seed model
----------------------------------------------------------

Next, we will prepare the client. The key part of a FEDn Project is the client definition -
code that contains entrypoints for training and (optionally) validating a model update on the client.
Create the compute package and seed model
--------------------------------------------

Next, we will prepare the client. For illustrative purposes, we use one of the pre-defined projects in the FEDn repository, ``mnist-pytorch``.

Navigate to ``examples/mnist-pytorch`` and familiarize yourself with the project structure. The dependencies needed in the client environment are specified
in ``client/python_env.yaml``.
In order to train a federated model using FEDn, your Studio project needs to be initialized with a ``compute package`` and a ``seed model``. The compute package is a code bundle containing the
code used by the client to execute local training and local validation. The seed model is a first version of the global model.

In order to train a federated model using FEDn, your Studio project needs to be initialized with a compute package and a seed model. The compute package is a bundle
of the client specification, and the seed model is a first version of the global model.
Navigate to the ``examples/mnist-pytorch`` folder in the cloned fedn repository. The compute package is located in the folder ``client``.

Create a package of the fedn project (assumes your current working directory is in the root of the project /examples/mnist-pytorch):
Create a package of the fedn project. From within ``examples/mnist-pytorch``, run:

.. code-block::
fedn package create --path client
This will create a package called 'package.tgz' in the root of the project.
This will create a package called ``package.tgz`` in the root of the project.

Next, run the build entrypoint defined in ``client/fedn.yaml`` to build the model artifact.
Next, create the seed model:

.. code-block::
fedn run build --path client
This will create a seed model called 'seed.npz' in the root of the project. We will now upload these to your Studio project using the FEDn APIClient.
This will create a seed model called ``seed.npz`` in the root of the project. We will now upload these to your Studio project using the FEDn APIClient.

For a detailed explanation of the FEDn Project with instructions for how to create your own project, see this guide: :ref:`projects-label`.

Initialize your FEDn Studio Project
------------------------------------

In the Studio UI, navigate to the project you created above and click on the "Sessions" tab. Click on the "New Session" button. Under the "Compute package" tab, select a name and upload the generated package file. Under the "Seed model" tab, upload the generated seed file:

**Upload the package and seed model**
.. image:: img/upload_package.png

**Upload the package and seed model using the Python APIClient**

It is also possible to upload a package and seed model using the Python API Client.

.. note::
You need to create an API admin token and use the token to authenticate the APIClient.
@@ -100,17 +113,17 @@ Upload the package and seed model using the APIClient:
Configure and attach clients
----------------------------

Each local client needs an access token in order to connect. These tokens are issued from your Studio Project. Go to the 'Clients' tab and click 'Connect client'.
Download a client configuration file and save it to the root of the examples/mnist-pytorch folder. Rename the file to 'client.yaml'.
Then start the client by running the following command in the root of the project:
Each local client needs an access token in order to connect. These tokens are issued from your Studio Project. Go to the 'Clients' tab and click 'Connect client'.
Download a client configuration file and save it to the root of the ``examples/mnist-pytorch`` folder. Rename the file to 'client.yaml'.
Then start the client by running the following command:

.. code-block::
fedn run client -in client.yaml --secure=True --force-ssl
Repeat the above for the number of clients you want to use. A normal laptop should be able to handle several clients for this example.

**Modifying the data split:**
**Modifying the data split (multiple clients, optional):**

The default training and test data for this example (MNIST) is, for convenience, downloaded and split by the client when it starts up (see the 'startup' entrypoint).
The number of splits and which split is used by a client can be controlled via the environment variables ``FEDN_NUM_DATA_SPLITS`` and ``FEDN_DATA_PATH``.
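
As a rough sketch of how a client-side script could pick up these two variables, consider the helper below. This is not the code used in the ``mnist-pytorch`` example; the function name ``read_split_settings`` and the fallback values (one split, no explicit data path) are assumptions made for illustration.

.. code-block:: python

    import os


    def read_split_settings(environ=None):
        """Return (number of splits, data path) from the environment.

        Hypothetical helper: the defaults below are assumptions, not FEDn's.
        """
        env = os.environ if environ is None else environ

        # Assumed default: a single split when the variable is not set.
        num_splits = int(env.get("FEDN_NUM_DATA_SPLITS", "1"))
        if num_splits < 1:
            raise ValueError("FEDN_NUM_DATA_SPLITS must be a positive integer")

        # Assumed default: None, meaning the caller picks its own data path.
        data_path = env.get("FEDN_DATA_PATH")
        return num_splits, data_path

Passing a plain dictionary instead of ``os.environ`` makes the helper easy to try out without changing your shell environment.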
@@ -138,7 +151,21 @@ For example, to split the data in 10 parts and start a client using the 8th part
Start a training session
------------------------

You are now ready to start training the model using the APIClient:
In Studio, click on the "Sessions" link, then the "New session" button in the upper right corner. Click the "Start session" tab, enter your desired settings (or use the defaults), and hit the "Start run" button. In the terminal where you are running your client you should now see some activity. When the round is completed, you can see the results in the FEDn Studio UI on the "Models" page.

**Watch the training progress**

Once a training session is started, you can monitor the progress of the training by navigating to "Sessions" and clicking the "Open" button of the active session. The session page will list the models as soon as they are generated. To get more information about a particular model, navigate to the model page by clicking the model name. From the model page you can download the model weights and get validation metrics.

To get an overview of how the models have evolved over time, navigate to the "Models" tab in the sidebar. Here you can see a list of all models generated across sessions along with a graph showing some metrics of how the models are performing.

.. image:: img/studio_model_overview.png

.. _studio-api:

**Control training sessions using the Python APIClient**

You can also start training sessions using the APIClient:

.. code:: python
@@ -153,12 +180,7 @@ You are now ready to start training the model using the APIClient:
>>> validations = client.get_validations(model_id=model_id)
Please see :py:mod:`fedn.network.api` for more details on the APIClient.

.. note::

In FEDn Studio, you can start a training session by going to the 'Sessions' tab and clicking 'Start session'. See :ref:`studio` for a
step-by-step guide on how to control experiments using the UI.
Please see :py:mod:`fedn.network.api` for more details on how to use the APIClient.

Access model updates
--------------------
@@ -167,15 +189,16 @@ Access model updates
In FEDn Studio, you can access global model updates by going to the 'Models' or 'Sessions' tab. Here you can download model updates, metrics (as csv) and view the model trail.


You can access global model updates via the APIClient:
You can also access global model updates via the APIClient:

.. code:: python
>>> ...
>>> client.download_model("<model-id>", path="model.npz")
**Connecting clients using Docker**
Connecting clients using Docker
--------------------------------

You can also use Docker to containerize the client.
For convenience, there is a Docker image hosted on ghcr.io with fedn preinstalled.
@@ -188,12 +211,18 @@ To start a client using Docker:
-e FEDN_PACKAGE_EXTRACT_DIR=package \
-e FEDN_NUM_DATA_SPLITS=2 \
-e FEDN_DATA_PATH=/app/package/data/clients/1/mnist.pt \
ghcr.io/scaleoutsystems/fedn/fedn:0.9.0 run client -in client.yaml --force-ssl --secure=True
ghcr.io/scaleoutsystems/fedn/fedn:0.10.0 run client -in client.yaml --force-ssl --secure=True
**Where to go from here?**
Where to go from here?
------------------------

With your first FEDn federation set up, we suggest that you take a close look at how a FEDn project is structured
With your first FEDn federated project set up, we suggest that you take a close look at how a FEDn project is structured
and how you develop your own FEDn projects:

- :ref:`projects-label`

You can also dive into the architecture overview to learn more about how FEDn is designed and works under the hood:

- :ref:`architecture-label`

