Add new tutorial example to OpenFL interactive API (securefederatedai…

…#812)

* Add new tutorial example to OpenFL interactive API

This adds a new tutorial example on distributing a linear regression task over OpenFL cluster. The model is defined by scikit-learn which is able to run over both cpu (by default) and gpu. The dataset is 1-dimensional noisy data of sinusoid with pre-defined parameters.

Fixes securefederatedai#798

Co-authored-by: Beverly Klemme <[email protected]>
Co-authored-by: Grant Baker <[email protected]>
Signed-off-by: Yi CAO <[email protected]>

* reduced requirements.txt in workspace

Signed-off-by: Beverly Klemme <[email protected]>

---------

Signed-off-by: Yi CAO <[email protected]>
Signed-off-by: Beverly Klemme <[email protected]>
Co-authored-by: Yi CAO <[email protected]>
kta-intel · May 24, 2023 · 660cc02
1 parent d31e475
commit 660cc02
Showing 10 changed files with 598 additions and 0 deletions.
diff --git a/openfl-tutorials/interactive_api/scikit_learn_linear_regression/README.md b/openfl-tutorials/interactive_api/scikit_learn_linear_regression/README.md
@@ -0,0 +1,55 @@
+# Scikit-learn based Linear Regression Tutorial
+
+### 1. About dataset
+
+Each Envoy generates 1-dimensional noisy samples of a sinusoid, `y = sin(x) + noise`, for the regression task.
+
+Set the parameter below under `shard_descriptor.params` in `envoy_config.yaml`; it is used as the random seed for dataset generation on that Envoy:
+- rank
+
+### 2. About model
+
+Linear Regression Lasso Model based on Scikit-learn.
+
+
+### 3. How to run this tutorial (without TLS and locally as a simulation):
+
+1. Run director:
+
+```sh
+cd director
+./start_director.sh
+```
+
+2. Run envoy:
+
+Step 1: Activate a virtual environment and install the packages:
+```sh
+cd envoy
+pip install -r requirements.txt
+```
+Step 2: Start the envoy:
+```sh
+./start_envoy.sh env_instance_1 envoy_config.yaml
+```
+
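+Optional: start a second envoy:
+
+- Copy the `envoy` folder to another place and follow the same process as above, using a config file with a different `rank` so that the second envoy generates different data: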
+
+```sh
+./start_envoy.sh env_instance_2 envoy_config_2.yaml
+```
+
+3. Run `scikit_learn_linear_regression.ipynb` jupyter notebook:
+
+```sh
+cd workspace
+jupyter lab scikit_learn_linear_regression.ipynb
+```
+
+4. Visualization
+
+```sh
+tensorboard --logdir logs/
+```
diff --git a/...fl-tutorials/interactive_api/scikit_learn_linear_regression/director/director_config.yaml b/...fl-tutorials/interactive_api/scikit_learn_linear_regression/director/director_config.yaml
@@ -0,0 +1,6 @@
+settings:
+  listen_host: localhost
+  listen_port: 50050
+  sample_shape: ['1'] # Modify this if the shard descriptor is changed to produce more than one feature per sample.
+  target_shape: ['1']
+  envoy_health_check_period: 5  # in seconds
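
The `sample_shape` and `target_shape` entries are lists of strings and are expected to agree with what the shard descriptor reports. A minimal sketch of how the descriptor in this commit arrives at `['1']` for both, assuming each data row is a feature followed by a target (the helper name `shape_strings` and the example row are hypothetical):

```python
import numpy as np


def shape_strings(value):
    """Return the shape of `value` as a list of strings, treating scalars as 1-D."""
    return [str(dim) for dim in np.array(value, ndmin=1).shape]


# One data row in the layout the shard descriptor uses: feature(s) first, target last.
row = np.array([1.5, 0.99])  # made-up values, for illustration only
*features, target = row

sample_shape = shape_strings(features)  # one feature per sample
target_shape = shape_strings(target)    # scalar target
print(sample_shape, target_shape)
```

If the descriptor were changed to emit more features per row, `sample_shape` would change accordingly and the director config would need the same value.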
diff --git a/openfl-tutorials/interactive_api/scikit_learn_linear_regression/director/start_director.sh b/openfl-tutorials/interactive_api/scikit_learn_linear_regression/director/start_director.sh
@@ -0,0 +1,4 @@
+#!/bin/bash
+set -e
+
+fx director start --disable-tls -c director_config.yaml
diff --git a/openfl-tutorials/interactive_api/scikit_learn_linear_regression/envoy/envoy_config.yaml b/openfl-tutorials/interactive_api/scikit_learn_linear_regression/envoy/envoy_config.yaml
@@ -0,0 +1,12 @@
+params:
+  cuda_devices: []
+
+optional_plugin_components: {}
+
+shard_descriptor:
+  template: linreg_shard_descriptor.LinRegSD
+  params:
+    rank: 1
+    n_samples: 80
+    noise: 0.15
+
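
The three `params` above map onto the shard descriptor's constructor arguments, with `rank` doubling as the random seed. The sketch below illustrates the consequence: the same `rank` reproduces the same shard, while a different `rank` gives a different one. It is a standalone approximation, not the tutorial's class; `make_shard` is a hypothetical helper, and it uses a local `RandomState` rather than seeding NumPy's global generator.

```python
import numpy as np


def make_shard(rank, n_samples=80, noise=0.15):
    """Generate rows of (x, y) with y = sin(x) + Gaussian noise, seeded by `rank`."""
    rng = np.random.RandomState(rank)  # assumption: local generator instead of np.random.seed
    x_deg = rng.rand(n_samples, 1) * 240 + 60  # angles in [60, 300) degrees
    x = x_deg * np.pi / 180                    # convert degrees to radians
    y = np.sin(x) + rng.normal(0, noise, size=(n_samples, 1))
    return np.concatenate((x, y), axis=1)


shard_1 = make_shard(rank=1)
shard_1_again = make_shard(rank=1)
shard_2 = make_shard(rank=2)

print(np.array_equal(shard_1, shard_1_again))  # same rank, same data
print(np.array_equal(shard_1, shard_2))        # different rank, different data
```

This is why a second envoy would typically be given its own `rank`: otherwise both envoys would train on identical samples.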
diff --git a/...tutorials/interactive_api/scikit_learn_linear_regression/envoy/linreg_shard_descriptor.py b/...tutorials/interactive_api/scikit_learn_linear_regression/envoy/linreg_shard_descriptor.py
@@ -0,0 +1,60 @@
+# Copyright (C) 2020-2021 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+"""Noisy-Sin Shard Descriptor."""
+
+from typing import List
+
+import numpy as np
+
+from openfl.interface.interactive_api.shard_descriptor import ShardDescriptor
+
+
+class LinRegSD(ShardDescriptor):
+    """Shard descriptor class."""
+
+    def __init__(self, rank: int, n_samples: int = 10, noise: float = 0.15) -> None:
+        """
+        Initialize LinReg Shard Descriptor.
+
+        This Shard Descriptor generates random data. Sample features are
+        floats between pi/3 and 5*pi/3, and targets are
+        calculated as sin(feature) + normal_noise.
+        """
+        np.random.seed(rank)  # Setting seed for reproducibility
+        self.n_samples = max(n_samples, 5)  # Enforce at least 5 samples
+        self.interval = 240  # Width of the sampling interval, in degrees
+        self.x_start = 60  # Start of the sampling interval, in degrees
+        x = np.random.rand(self.n_samples, 1) * self.interval + self.x_start
+        x *= np.pi / 180  # Convert degrees to radians
+        y = np.sin(x) + np.random.normal(0, noise, size=(self.n_samples, 1))
+        self.data = np.concatenate((x, y), axis=1)
+
+    def get_dataset(self, dataset_type: str) -> np.ndarray:
+        """
+        Return a shard dataset by type.
+
+        A simple list with elements (x, y) implements the Shard Dataset interface.
+        """
+        if dataset_type == 'train':
+            return self.data[:self.n_samples // 2]
+        elif dataset_type == 'val':
+            return self.data[self.n_samples // 2:]
+        else:
+            raise ValueError(f'Unknown dataset type: {dataset_type}')
+
+    @property
+    def sample_shape(self) -> List[str]:
+        """Return the sample shape info."""
+        (*x, _) = self.data[0]
+        return [str(i) for i in np.array(x, ndmin=1).shape]
+
+    @property
+    def target_shape(self) -> List[str]:
+        """Return the target shape info."""
+        (*_, y) = self.data[0]
+        return [str(i) for i in np.array(y, ndmin=1).shape]
+
+    @property
+    def dataset_description(self) -> str:
+        """Return the dataset description."""
+        return 'Allowed dataset types are `train` and `val`'
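
The descriptor hands out the first half of its rows as `train` and the remainder as `val`. A minimal standalone sketch of that split and of how a consumer might separate features from targets follows. `split_shard` is a hypothetical helper that halves by `len(data)`, the stand-in data is noise-free, and raising on an unknown type is a choice made for this sketch.

```python
import numpy as np


def split_shard(data, dataset_type):
    """Return the first half of the rows for 'train' and the rest for 'val'."""
    half = len(data) // 2
    if dataset_type == 'train':
        return data[:half]
    if dataset_type == 'val':
        return data[half:]
    raise ValueError(f'Unknown dataset type: {dataset_type!r}')


# Stand-in for a shard: 10 rows of (x, y) with y = sin(x) and no noise.
x = np.linspace(np.pi / 3, 5 * np.pi / 3, 10).reshape(-1, 1)
data = np.concatenate((x, np.sin(x)), axis=1)

train = split_shard(data, 'train')
val = split_shard(data, 'val')

# The last column is the target; everything before it is the feature matrix.
X_train, y_train = train[:, :-1], train[:, -1]
print(train.shape, val.shape, X_train.shape, y_train.shape)
```

Keeping the features two-dimensional, even with a single column, matches the input layout scikit-learn estimators expect.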
diff --git a/openfl-tutorials/interactive_api/scikit_learn_linear_regression/envoy/requirements.txt b/openfl-tutorials/interactive_api/scikit_learn_linear_regression/envoy/requirements.txt
@@ -0,0 +1,7 @@
+openfl>=1.2.1
+numpy>=1.13.3
+scikit-learn>=0.24.1
+matplotlib>=2.0.0
+mistune>=2.0.3 # not directly required, pinned by Snyk to avoid a vulnerability
+setuptools>=65.5.1 # not directly required, pinned by Snyk to avoid a vulnerability
+wheel>=0.38.0 # not directly required, pinned by Snyk to avoid a vulnerability
diff --git a/openfl-tutorials/interactive_api/scikit_learn_linear_regression/envoy/start_envoy.sh b/openfl-tutorials/interactive_api/scikit_learn_linear_regression/envoy/start_envoy.sh
@@ -0,0 +1,6 @@
+#!/bin/bash
+set -e
+ENVOY_NAME=$1
+ENVOY_CONF=$2
+
+fx envoy start -n "$ENVOY_NAME" --disable-tls --envoy-config-path "$ENVOY_CONF" -dh localhost -dp 50050
diff --git a/openfl-tutorials/interactive_api/scikit_learn_linear_regression/workspace/custom_adapter.py b/openfl-tutorials/interactive_api/scikit_learn_linear_regression/workspace/custom_adapter.py
@@ -0,0 +1,21 @@
+# Copyright (C) 2020-2023 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+"""Custom model numpy adapter."""
+
+from openfl.plugins.frameworks_adapters.framework_adapter_interface import (
+    FrameworkAdapterPluginInterface,
+)
+
+
+class CustomFrameworkAdapter(FrameworkAdapterPluginInterface):
+    """Framework adapter plugin class."""
+
+    @staticmethod
+    def get_tensor_dict(model, optimizer=None):
+        """Extract tensors from a model."""
+        return {'w': model.weights}
+
+    @staticmethod
+    def set_tensor_dict(model, tensor_dict, optimizer=None, device='cpu'):
+        """Load tensors to a model."""
+        model.weights = tensor_dict['w']
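
The adapter only requires that the model expose a `weights` attribute, which it exchanges under the key `'w'`. The round trip can be sketched without OpenFL as below. `ToyModel` and the two free functions are hypothetical stand-ins that mirror the static methods above, and the doubling step merely imitates some change made during aggregation.

```python
import numpy as np


class ToyModel:
    """Hypothetical stand-in for the tutorial's model: anything with a `weights` attribute."""

    def __init__(self, n_features):
        # Assumption for illustration: one coefficient per feature plus an intercept.
        self.weights = np.zeros(n_features + 1)


def get_tensor_dict(model):
    """Mirror of the adapter's extraction step."""
    return {'w': model.weights}


def set_tensor_dict(model, tensor_dict):
    """Mirror of the adapter's loading step."""
    model.weights = tensor_dict['w']


local = ToyModel(n_features=1)
local.weights = np.array([0.5, -1.0])

# Pretend an aggregator produced updated weights from the extracted tensors.
aggregated = {'w': get_tensor_dict(local)['w'] * 2}
set_tensor_dict(local, aggregated)
print(local.weights)
```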
diff --git a/openfl-tutorials/interactive_api/scikit_learn_linear_regression/workspace/requirements.txt b/openfl-tutorials/interactive_api/scikit_learn_linear_regression/workspace/requirements.txt
@@ -0,0 +1,7 @@
+openfl>=1.2.1
+numpy>=1.13.3
+scikit-learn>=0.24.1
+matplotlib>=2.0.0
+mistune>=2.0.3 # not directly required, pinned by Snyk to avoid a vulnerability
+setuptools>=65.5.1 # not directly required, pinned by Snyk to avoid a vulnerability
+wheel>=0.38.0 # not directly required, pinned by Snyk to avoid a vulnerability