Add new tutorial example to OpenFL interactive API (securefederatedai…
…#812)

* Add new tutorial example to OpenFL interactive API

This adds a new tutorial example on distributing a linear regression task across an OpenFL cluster.

The model is defined with scikit-learn, which can run on both CPU (the default) and GPU. The dataset is one-dimensional noisy sinusoid data generated with predefined parameters.

Fixes securefederatedai#798

Co-authored-by: Beverly Klemme <[email protected]>
Co-authored-by: Grant Baker <[email protected]>

Signed-off-by: Yi CAO <[email protected]>

* reduced requirements.txt in workspace

Signed-off-by: Beverly Klemme <[email protected]>

---------

Signed-off-by: Yi CAO <[email protected]>
Signed-off-by: Beverly Klemme <[email protected]>
Co-authored-by: Yi CAO <[email protected]>
bjklemme-intel and yi2cao authored May 24, 2023
1 parent d31e475 commit 660cc02
Showing 10 changed files with 598 additions and 0 deletions.
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
# Scikit-learn based Linear Regression Tutorial

### 1. About dataset

Generates one-dimensional noisy sinusoid data for linear regression.

Define the parameter below in the shard descriptor `params` section of the `envoy_config.yaml` file. It serves as the random seed for dataset generation on a specific Envoy:
- rank
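
A minimal sketch of how `rank` can act as a per-Envoy seed, mirroring the constants in the shard descriptor added by this commit (a 240-degree interval starting at 60 degrees). The helper name `make_noisy_sin` is hypothetical, and the sketch uses a local `RandomState` instead of the global NumPy seed the shard descriptor sets:

```python
import numpy as np


def make_noisy_sin(rank: int, n_samples: int = 80, noise: float = 0.15) -> np.ndarray:
    """Return an (n_samples, 2) array of (x, y) rows, seeded by `rank`."""
    rng = np.random.RandomState(rank)  # local generator; global state is untouched
    x_deg = rng.rand(n_samples, 1) * 240 + 60  # degrees in [60, 300)
    x = x_deg * np.pi / 180  # convert to radians
    y = np.sin(x) + rng.normal(0, noise, size=(n_samples, 1))
    return np.concatenate((x, y), axis=1)


shard_a = make_noisy_sin(rank=1)
shard_b = make_noisy_sin(rank=2)
shard_a_again = make_noisy_sin(rank=1)

# The same rank reproduces the same shard; a different rank gives different data.
print(np.array_equal(shard_a, shard_a_again), np.array_equal(shard_a, shard_b))
```

Giving each Envoy a distinct `rank` therefore gives each one a different, but reproducible, local dataset.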

### 2. About model

Linear Regression Lasso Model based on Scikit-learn.
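
For orientation, here is a hedged sketch of fitting a scikit-learn `Lasso` model to data of the same shape as this tutorial's dataset. The notebook's model code is not shown on this page, so this is not the tutorial's implementation. The `alpha` value and the seed are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Noisy sinusoid on [pi/3, 5*pi/3), as described in the dataset section.
rng = np.random.RandomState(0)  # arbitrary seed for this illustration
x = (rng.rand(80, 1) * 240 + 60) * np.pi / 180
y = np.sin(x).ravel() + rng.normal(0, 0.15, size=80)

model = Lasso(alpha=0.01)  # alpha is an assumed, illustrative value
model.fit(x, y)

pred = model.predict(x)
mse = float(np.mean((pred - y) ** 2))
print("slope:", model.coef_[0], "intercept:", model.intercept_, "mse:", mse)
```

A straight line can only capture the overall downward trend of the sinusoid on this interval, so the fitted slope is negative and the residual error stays well above the noise level.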


### 3. How to run this tutorial (without TLS, locally as a simulation):

1. Run director:

```sh
cd director folder
./start_director.sh
```

2. Run envoy:

Step 1: Activate virtual environment and install packages
```sh
cd envoy folder
pip install -r requirements.txt
```
Step 2: start the envoy
```sh
./start_envoy.sh env_instance_1 envoy_config.yaml
```

Optional: start second envoy:

- Copy `envoy_folder` to another place and follow the same process as above:

```sh
./start_envoy.sh env_instance_2 envoy_config_2.yaml
```

3. Run `scikit_learn_linear_regression.ipynb` jupyter notebook:

```sh
cd workspace
jupyter lab scikit_learn_linear_regression.ipynb
```

4. Visualization

```sh
tensorboard --logdir logs/
```
@@ -0,0 +1,6 @@
settings:
  listen_host: localhost
  listen_port: 50050
  sample_shape: ['1'] # Modify this param if experimenting with `n_features` of shard_descriptor.
  target_shape: ['1']
  envoy_health_check_period: 5 # in seconds
@@ -0,0 +1,4 @@
#!/bin/bash
set -e

fx director start --disable-tls -c director_config.yaml
@@ -0,0 +1,12 @@
params:
  cuda_devices: []

optional_plugin_components: {}

shard_descriptor:
  template: linreg_shard_descriptor.LinRegSD
  params:
    rank: 1
    n_samples: 80
    noise: 0.15

@@ -0,0 +1,60 @@
# Copyright (C) 2020-2021 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
"""Noisy-Sin Shard Descriptor."""

from typing import List

import numpy as np

from openfl.interface.interactive_api.shard_descriptor import ShardDescriptor


class LinRegSD(ShardDescriptor):
    """Shard descriptor class."""

    def __init__(self, rank: int, n_samples: int = 10, noise: float = 0.15) -> None:
        """
        Initialize LinReg Shard Descriptor.
        This Shard Descriptor generates random data. Sample features are
        floats between pi/3 and 5*pi/3, and targets are
        calculated as sin(feature) + normal_noise.
        """
        np.random.seed(rank)  # Setting seed for reproducibility
        self.n_samples = max(n_samples, 5)  # at least 5; used for generation and for the split
        self.interval = 240  # width of the feature range, in degrees
        self.x_start = 60  # lower bound of the feature range, in degrees
        x = np.random.rand(self.n_samples, 1) * self.interval + self.x_start
        x *= np.pi / 180  # degrees to radians
        y = np.sin(x) + np.random.normal(0, noise, size=(self.n_samples, 1))
        self.data = np.concatenate((x, y), axis=1)

    def get_dataset(self, dataset_type: str) -> np.ndarray:
        """
        Return a shard dataset by type.
        A simple array with elements (x, y) implements the Shard Dataset interface.
        """
        if dataset_type == 'train':
            return self.data[:self.n_samples // 2]
        elif dataset_type == 'val':
            return self.data[self.n_samples // 2:]
        else:
            pass

    @property
    def sample_shape(self) -> List[str]:
        """Return the sample shape info."""
        (*x, _) = self.data[0]
        return [str(i) for i in np.array(x, ndmin=1).shape]

    @property
    def target_shape(self) -> List[str]:
        """Return the target shape info."""
        (*_, y) = self.data[0]
        return [str(i) for i in np.array(y, ndmin=1).shape]

    @property
    def dataset_description(self) -> str:
        """Return the dataset description."""
        return 'Allowed dataset types are `train` and `val`'
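
The shape properties and the train/val split above can be traced on a tiny array without OpenFL installed. This is a sketch only: the four `(x, y)` rows are made-up values, not output of the shard descriptor.

```python
import numpy as np

# Made-up (x, y) rows standing in for `self.data`.
data = np.array([[1.0, 0.8], [2.0, 0.9], [3.0, 0.1], [4.0, -0.7]])

(*x, _) = data[0]  # every column except the last is a feature
(*_, y) = data[0]  # the last column is the target

sample_shape = [str(i) for i in np.array(x, ndmin=1).shape]
target_shape = [str(i) for i in np.array(y, ndmin=1).shape]
print(sample_shape, target_shape)  # ['1'] ['1']

# First half of the rows is 'train', the rest is 'val'.
n_samples = len(data)
train, val = data[:n_samples // 2], data[n_samples // 2:]
print(train.shape, val.shape)  # (2, 2) (2, 2)
```

Both shapes come out as `['1']`, which matches the `sample_shape` and `target_shape` values in the director config of this commit.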
@@ -0,0 +1,7 @@
openfl>=1.2.1
numpy>=1.13.3
scikit-learn>=0.24.1
matplotlib>=2.0.0
mistune>=2.0.3 # not directly required, pinned by Snyk to avoid a vulnerability
setuptools>=65.5.1 # not directly required, pinned by Snyk to avoid a vulnerability
wheel>=0.38.0 # not directly required, pinned by Snyk to avoid a vulnerability
@@ -0,0 +1,6 @@
#!/bin/bash
set -e
ENVOY_NAME=$1
ENVOY_CONF=$2

fx envoy start -n "$ENVOY_NAME" --disable-tls --envoy-config-path "$ENVOY_CONF" -dh localhost -dp 50050
@@ -0,0 +1,21 @@
# Copyright (C) 2020-2023 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
"""Custom model numpy adapter."""

from openfl.plugins.frameworks_adapters.framework_adapter_interface import (
    FrameworkAdapterPluginInterface,
)


class CustomFrameworkAdapter(FrameworkAdapterPluginInterface):
    """Framework adapter plugin class."""

    @staticmethod
    def get_tensor_dict(model, optimizer=None):
        """Extract tensors from a model."""
        return {'w': model.weights}

    @staticmethod
    def set_tensor_dict(model, tensor_dict, optimizer=None, device='cpu'):
        """Load tensors to a model."""
        model.weights = tensor_dict['w']
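
The adapter only requires that the model expose a `weights` attribute. The round trip can be sketched without OpenFL using plain functions that copy the two method bodies. `TinyModel` and its weight layout are hypothetical stand-ins; the tutorial's actual model class is not shown on this page.

```python
import numpy as np


class TinyModel:
    """Hypothetical stand-in: any object exposing a `weights` array."""

    def __init__(self, n_features: int = 1) -> None:
        # Assumed layout for illustration: one weight per feature plus an intercept.
        self.weights = np.zeros(n_features + 1)


def get_tensor_dict(model):
    """Same body as the adapter's get_tensor_dict."""
    return {'w': model.weights}


def set_tensor_dict(model, tensor_dict):
    """Same body as the adapter's set_tensor_dict."""
    model.weights = tensor_dict['w']


local = TinyModel()
local.weights = np.array([-0.6, 1.9])  # pretend these were learned locally

received = TinyModel()
set_tensor_dict(received, get_tensor_dict(local))
print(received.weights)
```

Note that this round trip passes the array by reference rather than copying it, so in this sketch both models end up sharing one `weights` array.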
@@ -0,0 +1,7 @@
openfl>=1.2.1
numpy>=1.13.3
scikit-learn>=0.24.1
matplotlib>=2.0.0
mistune>=2.0.3 # not directly required, pinned by Snyk to avoid a vulnerability
setuptools>=65.5.1 # not directly required, pinned by Snyk to avoid a vulnerability
wheel>=0.38.0 # not directly required, pinned by Snyk to avoid a vulnerability