Docs/update (#581)
ahellander authored Apr 18, 2024
1 parent 2c06fc8 commit d59d6d8
Showing 24 changed files with 381 additions and 383 deletions.
9 changes: 5 additions & 4 deletions .ci/tests/examples/print_logs.sh
@@ -12,7 +12,8 @@ echo "Combiner logs"
docker logs "$(basename $PWD)-combiner-1"

echo "Client 1 logs"
docker logs "$(basename $PWD)-client-1"

echo "Client 2 logs"
docker logs "$(basename $PWD)-client-2"
if [ "$example" == "mnist-keras" ]; then
docker logs "$(basename $PWD)-client-1"
else
docker logs "$(basename $PWD)-client1-1"
fi
15 changes: 11 additions & 4 deletions .ci/tests/examples/run.sh
@@ -19,10 +19,17 @@ pushd "examples/$example"
"../../.$example/bin/fedn" package create --path client
"../../.$example/bin/fedn" run build --path client

docker compose \
-f ../../docker-compose.yaml \
-f docker-compose.override.yaml \
up -d --build --scale client=1
if [ "$example" == "mnist-keras" ]; then
docker compose \
-f ../../docker-compose.yaml \
-f docker-compose.override.yaml \
up -d --build --scale client=1
else
docker compose \
-f ../../docker-compose.yaml \
-f docker-compose.override.yaml \
up -d --build combiner api-server mongo minio client1
fi

>&2 echo "Wait for reducer to start"
python ../../.ci/tests/examples/wait_for.py reducer
46 changes: 1 addition & 45 deletions .github/workflows/build-containers.yaml
@@ -35,40 +35,14 @@ jobs:
type=semver,pattern={{version}}
type=semver,pattern={{major}}.{{minor}}
type=sha
- name: Docker meta mnist-keras
id: meta2
uses: docker/metadata-action@v4
with:
images: |
docker.pkg.github.com/${{ github.repository }}/fedn
tags: |
type=ref,event=branch,suffix=-mnist-keras
type=ref,event=pr,suffix=-mnist-keras
type=semver,pattern={{version}},suffix=-mnist-keras
type=semver,pattern={{major}}.{{minor}},suffix=-mnist-keras
type=sha,suffix=-mnist-keras
- name: Docker meta mnist-pytorch
id: meta3
uses: docker/metadata-action@v4
with:
images: |
docker.pkg.github.com/${{ github.repository }}/fedn
tags: |
type=ref,event=branch,suffix=-mnist-pytorch
type=ref,event=pr,suffix=-mnist-pytorch
type=semver,pattern={{version}},suffix=-mnist-pytorch
type=semver,pattern={{major}}.{{minor}},suffix=-mnist-pytorch
type=sha,suffix=-mnist-pytorch
- name: Log in to GitHub Container Registry
uses: docker/login-action@v2
with:
registry: docker.pkg.github.com
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}


- name: Build and push
uses: docker/build-push-action@v4
@@ -77,21 +51,3 @@ jobs:
tags: ${{ steps.meta1.outputs.tags }}
labels: ${{ steps.meta1.outputs.labels }}
file: Dockerfile

- name: Build and push (mnist-keras)
uses: docker/build-push-action@v4
with:
push: "${{ github.event_name != 'pull_request' }}"
tags: ${{ steps.meta2.outputs.tags }}
labels: ${{ steps.meta2.outputs.labels }}
file: Dockerfile
build-args: |
REQUIREMENTS=examples/mnist-keras/requirements.txt
- name: Build and push (mnist-pytorch)
uses: docker/build-push-action@v4
with:
push: "${{ github.event_name != 'pull_request' }}"
tags: ${{ steps.meta3.outputs.tags }}
labels: ${{ steps.meta3.outputs.labels }}
file: Dockerfile
1 change: 1 addition & 0 deletions .github/workflows/code-checks.yaml
@@ -45,6 +45,7 @@ jobs:
--exclude-dir='docs'
--exclude-dir='flower-client'
--exclude='tests.py'
--exclude='README.rst'
'^[ \t]+(import|from) ' -I .
# TODO: add linting/formatting for all file types
2 changes: 1 addition & 1 deletion .github/workflows/integration-tests.yaml
@@ -17,7 +17,7 @@ jobs:
to_test:
- "mnist-keras numpyhelper"
- "mnist-pytorch numpyhelper"
python_version: ["3.9","3.10", "3.11"]
python_version: ["3.8","3.9","3.10", "3.11"]
os:
- ubuntu-22.04
runs-on: ${{ matrix.os }}
8 changes: 4 additions & 4 deletions README.rst
@@ -57,7 +57,7 @@ Getting started

The best way to get started is to take the quickstart tutorial:

- `Quickstart <https://fedn.readthedocs.io/en/latest/quickstart.html>`__
- `Quickstart <https://fedn.readthedocs.io/en/stable/quickstart.html>`__

Documentation
=============
@@ -72,8 +72,8 @@ Running your project in FEDn Studio (SaaS or on-premise)

The FEDn Studio SaaS is free for development, testing and research (one project per user, backend compute resources sized for dev/test):

- `Register for a free account in FEDn Studio <https://studio.scaleoutsystems.com/signup/>`__
- `Take the tutorial to deploy your project on FEDn Studio <https://guide.scaleoutsystems.com/#/docs>`__
- `Register for a free account in FEDn Studio <https://fedn.scaleoutsystems.com/signup/>`__
- `Take the tutorial to deploy your project on FEDn Studio <https://fedn.readthedocs.io/en/stable/studio.html>`__

Scaleout can also support users to scale up experiments and demonstrators on Studio, by granting custom resource quotas. Additionally, charts are available for self-managed deployment on-premise or in your cloud VPC (all major cloud providers). Contact the Scaleout team for more information.

@@ -91,7 +91,7 @@ Making contributions

All pull requests will be considered and are much appreciated. For
more details please refer to our `contribution
guidelines <https://github.com/scaleoutsystems/fedn/blob/develop/CONTRIBUTING.md>`__.
guidelines <https://github.com/scaleoutsystems/fedn/blob/master/CONTRIBUTING.md>`__.

Citation
========
2 changes: 1 addition & 1 deletion docker-compose.yaml
@@ -1,5 +1,5 @@
# Compose schema version
version: '3.3'
version: '3.4'

# Setup network
networks:
1 change: 1 addition & 0 deletions docs/_static/css/text.css
@@ -34,6 +34,7 @@ body {

a {
color: var(--scaleout-black);
font-weight: bold;
text-decoration: none;
display: inline-block;
}
3 changes: 2 additions & 1 deletion docs/conf.py
@@ -23,7 +23,8 @@
'sphinx.ext.mathjax',
'sphinx.ext.ifconfig',
'sphinx.ext.viewcode',
'sphinx_rtd_theme'
'sphinx_rtd_theme',
'sphinx_code_tabs'
]

# The master toctree document.
8 changes: 4 additions & 4 deletions docs/distributed.rst
@@ -1,13 +1,13 @@
Distributed deployment
======================
Self-managed distributed deployment
===================================

This tutorial outlines the steps for deploying the FEDn framework over a **local network**, using a single workstation or laptop as
the host, and different devices as clients. For general steps on how to run FEDn, see one of the quickstart tutorials.
the host for the server-side components, and other hosts or devices as clients. For general steps on how to run FEDn, see the quickstart tutorials.


.. note::
For a secure and production-grade deployment solution over **public networks**, explore the FEDn Studio service at
**studio.scaleoutsystems.com**.
**fedn.scaleoutsystems.com**.

Alternatively follow this tutorial substituting the host's local IP with your public IP, open the necessary
ports (see which ports are used in docker-compose.yaml), and ensure you have taken additional necessary security
29 changes: 9 additions & 20 deletions docs/faq.rst
@@ -19,17 +19,6 @@ However, during development of a new model it will be necessary to reinitialize.
2. Restart the clients.

Q: Can I skip fetching the remote package and instead use a local folder when developing the compute package
------------------------------------------------------------------------------------------------------------

Yes, to facilitate interactive development of the compute package you can start a client that uses a local folder 'client' in your current working directory by:

.. code-block:: bash
fedn run client --remote=False -in client.yaml
Note that in production federations this options should in most cases be disallowed.

Q: How can other aggregation algorithms be defined?
---------------------------------------------------
@@ -39,10 +28,10 @@ There is a plugin interface for extending the framework with new aggregators. Se
:ref:`agg-label`


Q: What is needed to include other ML frameworks in FEDn like sklearn, xgboost, etc.?
Q: What is needed to include additional ML frameworks in FEDn?
-------------------------------------------------------------------------------------

You need to make sure that FEDn knows how to serialize and deseralize the model object into paramters. If you can
You need to make sure that FEDn knows how to serialize and deserialize the model object. If you can
serialize to a list of numpy ndarrays in your compute package entrypoint (see the Quickstart Tutorial code), you
can use the built in "numpyhelper". If this is not possible, you can extend the framework with a custom helper,
see the section about model marshaling:
@@ -62,27 +51,27 @@ Yes! You can toggle which message streams a client subscibes to when starting th
Q: How do you approach the question of output privacy?
----------------------------------------------------------------------------------

We take security in (federated) machine learning very seriously. Federated learning is a foundational technology that impoves input privacy
We take security in (federated) machine learning seriously. Federated learning is a foundational technology that improves input privacy
in machine learning by allowing datasets to stay local and private, rather than being copied to a server. FEDn is designed to provide an industry-grade
implementation of the core communication and aggregation layers of federated learning, as well as configurable modules for traceability, logging
etc., to allow the developer to balance between privacy and auditability. With `FEDn Studio <https://scaleoutsystems.com/framework>`__ we add
functionality for user authentication, authorization, and federated client identity management. As such, the FEDn Framework provides
a comprehensive software suite for implementing secure federated learning following industry best-practices.

Going beyond input privacy, there are several additional considerations relating to output privacy and potential attacks on (federated) machine learning systems. For an
introduction to the topic, see this blog post:
Going beyond input privacy, there are several additional considerations relating to output privacy and potential attacks on (federated) machine learning systems.
For an introduction to the topic, see this blog post:

- `Output Privacy and Federated Machine Learning <https://www.scaleoutsystems.com/post/output-privacy-and-federated-machine-learning>`__

Striking the appropriate balance between system complexity and secturity becomes a use-case dependent endeavor, and we are happy to
engage in detailed conversations about this. As an example, one might consider layering differential privacy on top of the aggregation
to protect against a honest-but-curious server, at the price of a loss of accuracy for the global model. Depending on the privacy requirements,
Striking the appropriate balance between system complexity and security becomes a use-case dependent endeavor, and we are happy to
support projects with guidance on these matters. For example, one might consider layering differential privacy on top of the aggregation
to protect against an honest-but-curious server, at the price of reduced accuracy for the global model. Depending on the privacy requirements,
the model type, the amount of data, the number of local updates possible during training etc., this may or may not be necessary.

We are engaged in several cybersecurity projects focused on federated machine learning; do not hesitate to reach out to discuss further
with the Scaleout team.

- `LEAKPRO: Leakage Profiling and Risk Oversight for Machine Learning Models <https://www.vinnova.se/en/p/leakpro-leakage-profiling-and-risk-oversight-for-machine-learning-models/>`__
- `Validating a System Development Kit for edge federated learning <https://www.vinnova.se/en/p/validating-a-system-development-kit-for-edge-federated-learning/>`__
- `Truseted Execution Environments for Federated Learning: <https://www.vinnova.se/en/p/trusted-execution-environments-for-federated-learning/>`__
- `Trusted Execution Environments for Federated Learning <https://www.vinnova.se/en/p/trusted-execution-environments-for-federated-learning/>`__
- `Robust IoT Security: Intrusion Detection Leveraging Contributions from Multiple Systems <https://www.vinnova.se/en/p/robust-iot-security-intrusion-detection-leveraging-contributions-from-multiple-systems/>`__
4 changes: 2 additions & 2 deletions docs/helpers.rst
@@ -1,7 +1,7 @@
.. _helper-label:

Model Serialization/Deserialization - Helpers
=============================================
Model Serialization/Deserialization
===================================

In federated learning, model updates need to be serialized and deserialized in order to be
transferred between clients and server/combiner. There is also a need to write and load models
62 changes: 30 additions & 32 deletions docs/introduction.rst
Original file line number Diff line number Diff line change
@@ -7,53 +7,51 @@ Federated Learning allows for collaborative model training while keeping data lo
scenarios where data cannot be easily shared due to privacy regulations, network limitations, or ownership concerns.

At its core, Federated Learning orchestrates model training across distributed devices or servers, referred to as clients or participants.
These participants could be diverse endpoints such as mobile devices, IoT gadgets, or remote servers. Rather than transmitting raw data to a central location,
These participants could be diverse endpoints such as mobile devices, IoT gateways, or remote servers. Rather than transmitting raw data to a central location,
each participant computes gradients locally based on its data. These gradients are then communicated to a server, often called the aggregator.
The server aggregates and combines the gradients from multiple participants to update a global model.
This iterative process allows the global model to improve without the need to share the raw data.
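
The aggregation step described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not FEDn's implementation: the function names `aggregate_updates` and `apply_update`, the flat-list model representation, the weighting by local example counts, and the learning rate are all hypothetical choices made for the example.

```python
from typing import List, Sequence


def aggregate_updates(
    updates: Sequence[Sequence[float]],
    num_examples: Sequence[int],
) -> List[float]:
    """Combine client updates into one update (hypothetical helper).

    Each client's contribution is weighted by the number of local
    examples it trained on, as in federated averaging.
    """
    if not updates or len(updates) != len(num_examples):
        raise ValueError("need one example count per client update")
    total = sum(num_examples)
    if total <= 0:
        raise ValueError("total number of examples must be positive")
    dim = len(updates[0])
    if any(len(u) != dim for u in updates):
        raise ValueError("all client updates must have the same length")

    aggregated = [0.0] * dim
    for update, n in zip(updates, num_examples):
        weight = n / total
        for i, value in enumerate(update):
            aggregated[i] += weight * value
    return aggregated


def apply_update(
    global_model: Sequence[float],
    aggregated_gradient: Sequence[float],
    learning_rate: float = 0.1,
) -> List[float]:
    """Take one gradient step on the global model (illustrative only)."""
    return [w - learning_rate * g for w, g in zip(global_model, aggregated_gradient)]


if __name__ == "__main__":
    # Two clients send locally computed gradients; raw data never leaves them.
    client_gradients = [[1.0, 2.0], [3.0, 4.0]]
    client_sizes = [1, 3]
    combined = aggregate_updates(client_gradients, client_sizes)
    print(apply_update([1.0, 1.0], combined, learning_rate=0.5))
```

Only the locally computed updates and the example counts reach the server in this sketch; the training data itself is never part of the exchange.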

**FEDn: the SDK for scalable federated learning**
FEDn empowers users to create federated learning applications that seamlessly transition from local proofs-of-concept to secure distributed deployments.
We develop the FEDn framework following these core design principles:

FEDn serves as a System Development Kit (SDK) enabling scalable federated learning.
It is used to implement the core server side logic (including model aggregation) and the client side integrations.
Developers and ML engineers can use FEDn to build custom federated learning systems and bespoke deployments.
- **Seamless transition from proof-of-concept to real-world FL**. FEDn has been designed to make the journey from R&D to real-world deployments as smooth as possible. Develop your federated learning use case in a pseudo-local environment, then deploy it to FEDn Studio (cloud or on-premise) for real-world scenarios. No code change is required to go from development and testing to production.

- **Designed for scalability and resilience.** FEDn enables model aggregation through multiple aggregation servers sharing the workload. A hierarchical architecture makes the framework well suited for both cross-silo and cross-device use-cases. FEDn seamlessly recovers from failures in all critical components and manages intermittent client connections, ensuring robust deployment in production environments.

One of the standout features of FEDn is its ability to deploy and scale the server-side in geographically distributed setups,
adapting to varying project needs and geographical considerations.
- **Secure by design.** FL clients do not need to open any ingress ports, facilitating distributed deployments across a wide variety of settings. Additionally, FEDn utilizes secure, industry-standard communication protocols and supports token-based authentication and RBAC for FL clients (JWT), providing flexible integration in production environments.

- **Developer and data scientist friendly.** Extensive event logging and distributed tracing enable developers to monitor experiments in real-time, simplifying troubleshooting and auditing. Machine learning metrics can be accessed via a Python API and visualized in an intuitive UI that helps data scientists analyze and communicate ML-model training progress.

**Scalable and Resilient**

FEDn exhibits scalability and resilience, thanks to its tiered architecture. Multiple aggregation servers, in FEDn called combiners,
form a network to divide the workload of coordinating clients and aggregating models.
This architecture allows for high performance in various settings, from thousands of clients in a cross-device environment to
large model updates in a cross-silo scenario. Importantly, FEDn has built-in recovery capabilities for all critical components, enhancing system reliability.
Features
=========

**ML-Framework Agnostic**
Federated machine learning:

With FEDn, model updates are treated as black-box computations, meaning it can support any ML model type or framework.
This flexibility allows for out-of-the-box support for popular frameworks like Keras and PyTorch, making it a versatile tool for any machine learning project.
- Support for any ML framework (e.g. PyTorch, TensorFlow/Keras and Scikit-learn)
- Extendable via a plug-in architecture (aggregators, load balancers, object storage backends, databases etc.)
- Built-in federated algorithms (FedAvg, FedAdam, FedYogi, FedAdaGrad, etc.)
- CLI and Python API client for running FEDn networks and coordinating experiments.
- Implement clients in any language (Python, C++, Kotlin etc.)
- No open ports needed client-side.

**Security**

A key security feature of FEDn is its client protection capabilities - clients do not need to expose any ingress ports,
thus reducing potential security vulnerabilities.
FEDn Studio - From development to FL in production:

**Event Tracking and Training progress**
- Leverage Scaleout's free managed service for development and testing in real-world scenarios (SaaS).
- Token-based authentication (JWT) and role-based access control (RBAC) for FL clients.
- REST API and UI.
- Data science dashboard for orchestrating experiments and visualizing results.
- Admin dashboard for managing the FEDn network and users/clients.
- View extensive logging and tracing information.
- Collaborate with other data-scientists on the project specification in a shared workspace.
- Cloud or on-premise deployment (cloud-native design, deploy to any Kubernetes cluster)

To ensure transparency and control over the training process, as well as to provide means to troubleshoot distributed deployments,
FEDn logs events and does real-time tracking of training progress. A flexible API lets the user define validation strategies locally on clients.
Data is logged as JSON to MongoDB, enabling users to create custom dashboards and visualizations easily.
Support
=========

**REST-API and Python API Client and CLI**
Community support is available in our `Discord
server <https://discord.gg/KMg4VwszAd>`__.

FEDn comes with an REST-API, a CLI and a Python API Client for programmatic interaction with a FEDn network. This allows for flexible automation of experiments, for integration with
other systems, and for easy integration with external dashboards and visualization tools.

FEDn Studio
-----------

FEDn Studio is a web-based tool for managing and monitoring federated learning experiments. It provides the FEDn network as a managed service, as well as a user-friendly interface for monitoring the progress of training and visualizing the results. FEDn Studio is available as a SaaS at fedn.scaleoutsystems.com . It is free for development, testing and research (one project per user, backend compute resources sized for dev/test).

Scaleout can also support users to scale up experiments and demonstrators on Studio, by granting custom resource quotas. Additonally, charts are available for self-managed deployment on-premise or in your cloud VPC (all major cloud providers). Contact the Scaleout team for more information.
Options are available for `Enterprise support <https://www.scaleoutsystems.com/start#pricing>`__.