Merge pull request #36 from IBM/version1.0.2

Version 1.0.2 of IBM Federated Learning library
IBM · Nov 3, 2020 · 63c92cb · 63c92cb
2 parents 46edac9 + cb53fe8
commit 63c92cb
Showing 11 changed files with 118 additions and 132 deletions.
diff --git a/docs/tutorials/README.md b/docs/tutorials/README.md
@@ -8,3 +8,7 @@
 
 - Learn how to create your own data handler in [Create a customized data handler](../tutorials/create_my_data_handler.md).
 
+- Learn how to load large datasets via data generators in [Set up data generators](../tutorials/set_up_data_generators_for_fl.md).
+
+- Learn how to specify the quorum and the maximum timeout for each round, and how a party can rejoin after a dropout, in [Quorum handling and ability to Rejoin](../tutorials/quorum_rejoin.md).
+
diff --git a/docs/tutorials/configure_gpu_training.md b/docs/tutorials/configure_gpu_training.md
@@ -0,0 +1,56 @@
+# Enabling GPU training
+
+IBM federated learning supports training neural network models
+in a GPU environment on the party side to speed up the training process.
+
+## Environment setup
+Install the libraries required for GPU training.
+ - For Keras and TensorFlow models, install the corresponding `tensorflow-gpu` package
+ according to the [TensorFlow GPU tutorial](https://www.tensorflow.org/install/gpu).
+ IBM FL currently requires `tensorflow==1.15.0`, so
+ you will need to install `tensorflow-gpu==1.15.0` in your GPU environment.
+
+## IBM FL configuration
+Users can enable and specify the number of GPUs they want to use for training 
+via the party's configuration file. 
+Below is an example of the party's configuration file:
+```yaml
+aggregator:
+  ip: 127.0.0.1
+  port: 5000
+connection:
+  info:
+    ip: 127.0.0.1
+    port: 8085
+    tls_config:
+      enable: false
+  name: FlaskConnection
+  path: ibmfl.connection.flask_connection
+  sync: false
+data:
+  info:
+    npz_file: examples/data/mnist/random/data_party0.npz
+  name: MnistKerasDataHandler
+  path: ibmfl.util.data_handlers.mnist_keras_data_handler
+local_training:
+  name: LocalTrainingHandler
+  path: ibmfl.party.training.local_training_handler
+model:
+  name: KerasFLModel
+  path: ibmfl.model.keras_fl_model
+  spec:
+    model_definition: examples/configs/keras_classifier/compiled_keras.h5
+    model_name: keras-cnn
+  info:
+    gpu:
+      num_gpus: 2 # enabling keras training with 2 GPUs
+protocol_handler:
+  name: PartyProtocolHandler
+  path: ibmfl.party.party_protocol_handler
+```
+In the above example, the `gpu` section under the `info` section of `model` specifies
+the GPU setting for the party's local training.
+Users can change `num_gpus` according to the computing resources available to the parties.
+
+If no `gpu` section is present in `info`, Keras/TensorFlow.keras training will
+use the default CPU environment or **only one GPU**, even if the party can access multiple GPUs.
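
The fallback described in the tutorial above can be sketched as a small lookup on the parsed `model` section of the party config. This is only an illustration: `get_num_gpus` is a hypothetical helper and not part of the IBM FL API, and the config is given as a plain dictionary rather than loaded from the YAML file.

```python
from typing import Any, Mapping, Optional


def get_num_gpus(model_config: Mapping[str, Any]) -> Optional[int]:
    """Return the requested GPU count, or None when no `gpu` section is set.

    Hypothetical helper for illustration; not the IBM FL implementation.
    """
    info = model_config.get("info") or {}
    gpu = info.get("gpu") or {}
    num_gpus = gpu.get("num_gpus")
    if num_gpus is None:
        # No explicit setting: the caller falls back to the default device.
        return None
    num_gpus = int(num_gpus)
    if num_gpus < 1:
        raise ValueError("num_gpus must be a positive integer")
    return num_gpus


# The `model` section from the example config, as a Python dictionary.
with_gpu = {
    "name": "KerasFLModel",
    "path": "ibmfl.model.keras_fl_model",
    "info": {"gpu": {"num_gpus": 2}},
}
without_gpu = {"name": "KerasFLModel", "path": "ibmfl.model.keras_fl_model"}

print(get_num_gpus(with_gpu))
print(get_num_gpus(without_gpu))
```

Returning `None` rather than `1` keeps "not configured" distinct from "explicitly one GPU", which matches the tutorial's distinction between the default environment and a multi-GPU request.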
diff --git a/docs/tutorials/quorum_rejoin.md b/docs/tutorials/quorum_rejoin.md
@@ -0,0 +1,21 @@
+# Quorum handling and ability to Rejoin
+
+## Quorum handling
+IBM FL lets you specify a quorum percentage in the aggregator config file, which gives flexibility to parties with potential connectivity failures. Given the total number of parties registered at a particular round, the quorum percentage defines the minimum number of parties that must reply in that round. If the aggregator receives fewer replies than that in some round, it stops the federated learning process. This means that parties that drop out for some reason can rejoin, as long as the number of available parties does not fall below the quorum value.
+
+For example, in the following configuration file `perc_quorum` is set to 0.75, so in each round the aggregator expects 75% of the registered parties to reply. If 20 parties have registered, federated learning continues as long as no more than five parties drop out.
+
+```
+hyperparams:
+  global:
+    max_timeout: 60
+    num_parties: 5
+    perc_quorum: 0.75
+    rounds: 3
+    termination_accuracy: 0.9
+```
+
+## Maximum Timeout and Rejoin
+Users can specify, in the aggregator configuration file, the maximum timeout (in seconds) that the aggregator should wait for parties to reply. If `max_timeout` is specified, the aggregator waits for the specified amount of time to check whether the required number of parties (calculated from the quorum percentage described above) have replied. Note that if the quorum percentage is not specified, the aggregator assumes 100% and expects a reply from all registered parties. Similarly, if the maximum timeout is not specified, the aggregator waits indefinitely for parties to reply.
+
+To rejoin, a party just needs to issue the START and REGISTER commands, as it did initially to join the federated learning process.
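
The quorum arithmetic in the tutorial above can be sketched in a few lines. The function names are hypothetical, and rounding the required count up to the next whole party is an assumption; the tutorial does not say how IBM FL rounds a fractional result.

```python
import math


def required_replies(num_registered: int, perc_quorum: float = 1.0) -> int:
    """Minimum number of replies needed in a round.

    Assumption: fractional results are rounded up. The default of 1.0
    mirrors the documented behaviour when no quorum is specified.
    """
    if not 0.0 < perc_quorum <= 1.0:
        raise ValueError("perc_quorum must be in (0, 1]")
    # The small epsilon guards against float noise such as 0.7 * 10.
    return math.ceil(num_registered * perc_quorum - 1e-9)


def quorum_met(num_registered: int, num_replies: int,
               perc_quorum: float = 1.0) -> bool:
    """True when enough parties have replied for the round to continue."""
    return num_replies >= required_replies(num_registered, perc_quorum)


# The 20-party example: 15 replies are needed, so five dropouts are fine.
print(required_replies(20, 0.75))
print(quorum_met(20, 15, 0.75))
print(quorum_met(20, 14, 0.75))
```

Under the same rounding assumption, the `num_parties: 5` setting in the sample config with `perc_quorum: 0.75` would require four replies per round.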
diff --git a/examples/constants.py b/examples/constants.py
@@ -20,7 +20,6 @@
 FL_DATASETS = ["default", "mnist", "nursery", "adult", "federated-clustering",
                 "higgs", "airline", "diabetes", "binovf", "multovf", "linovf"]
 FL_EXAMPLES = ["id3_dt", "fedavg", "keras_classifier","pfnm",
-                "sklearn_logclassification", "sklearn_sgdclassifier",
-                "rl_cartpole", "rl_pendulum", "coordinate_median", "krum",
+                "sklearn_logclassification", "rl_cartpole", "rl_pendulum", "coordinate_median", "krum",
                 "naive_bayes", "keras_gradient_aggregation", "spahm", "zeno"]
 FL_CONN_TYPES = ["flask"]
diff --git a/examples/sklearn_logclassification/README.md b/examples/sklearn_logclassification/README.md
@@ -1,24 +1,37 @@
 
-# Running Scikitlearn Logistic Regression Classifier on Adult Dataset in IBM federated learning
+# Running Scikitlearn Logistic Classifier in IBM federated learning
+
+Currently, for logistic classifier we support the following datasets:
+
+* [Adult Dataset](https://archive.ics.uci.edu/ml/datasets/Adult)
+* [MNIST](http://yann.lecun.com/exdb/mnist/)
 
-This example explains how to run federated learning on a Logistic Regression Classifier, implemented with Scikit-Learn
-training on [Adult Dataset](https://archive.ics.uci.edu/ml/datasets/Adult).
 
 The following preprocessing was performed in `AdultSklearnDataHandler` on the original dataset:
   * Drop following features: `workclass`, `fnlwgt`, `education`, `marital-status`, `occupation`, `relationship`, `capital-gain`, `capital-loss`, `hours-per-week`, `native-country`
   * Map `race`, `sex` and `class` values to 0/1
   * Split `age` and `education` columns into multiple columns based on value
 
+  Further details in documentation of `preprocess()` in `AdultSklearnDataHandler`.
+
+No other preprocessing is performed.
+
+The following preprocessing was performed on the MNIST dataset:
+
+* Data is scaled down from the range `[0, 255]` to `[0, 1]`
+* Images are reshaped from `[28, 28]` to `[1, 784]`
+
+
 No other preprocessing is performed.
 
 - Split data by running:
 
     ```
-    python examples/generate_data.py -n <num_parties> -d adult -pp <points_per_party>
+    python examples/generate_data.py -n <num_parties> -d <dataset_name> -pp <points_per_party>
     ```
 - Generate config files by running:
     ```
-    python examples/generate_configs.py -n <num_parties> -m sklearn_logclassification -d adult -p <path>
+    python examples/generate_configs.py -n <num_parties> -m sklearn_logclassification -d <dataset_name> -p <path>
     ```
 - In a terminal running an activated IBM FL environment 
 (refer to Quickstart in our website to learn more about how to set up the running environment), start the aggregator by running:

diff --git a/examples/sklearn_logclassification/generate_configs.py b/examples/sklearn_logclassification/generate_configs.py
@@ -1,5 +1,6 @@
 import os
 import pickle
+import numpy as np
 from sklearn.linear_model import SGDClassifier
 
 import examples.datahandlers as datahandlers
@@ -39,10 +40,12 @@ def get_hyperparams():
 
 def get_data_handler_config(party_id, dataset, folder_data, is_agg=False):
 
-    SUPPORTED_DATASETS = ['adult']
+    SUPPORTED_DATASETS = ['adult', 'mnist']
     if dataset in SUPPORTED_DATASETS:
         if dataset == 'adult':
             dataset = 'adult_sklearn'
+        elif dataset == 'mnist':
+            dataset = 'mnist_sklearn'
         data = datahandlers.get_datahandler_config(
             dataset, folder_data, party_id, is_agg)
     else:
@@ -57,6 +60,11 @@ def get_model_config(folder_configs, dataset, is_agg=False, party_id=0):
 
     model = SGDClassifier(loss='log', penalty='l2')
 
+    if dataset == 'adult':
+        model.classes_ = np.array([0, 1])
+    elif dataset == 'mnist':
+        model.classes_ = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
+
     if not os.path.exists(folder_configs):
         os.makedirs(folder_configs)
 

diff --git a/examples/sklearn_sgdclassifier/README.md b/examples/sklearn_sgdclassifier/README.md
diff --git a/examples/sklearn_sgdclassifier/generate_configs.py b/examples/sklearn_sgdclassifier/generate_configs.py
diff --git a/federated-learning-lib/federated_learning_lib-1.0.1-py3-none-any.whl b/federated-learning-lib/federated_learning_lib-1.0.1-py3-none-any.whl
diff --git a/federated-learning-lib/federated_learning_lib-1.0.2-py3-none-any.whl b/federated-learning-lib/federated_learning_lib-1.0.2-py3-none-any.whl
diff --git a/log_config.yaml b/log_config.yaml
@@ -1,20 +1,22 @@
 version: 1
 disable_existing_loggers: False
 formatters:
-    ffl_std:
-        format: "%(asctime)s -STD %(name)s - %(levelname)s - %(message)s"
+    fl_std:
+        format: "%(asctime)s | %(version)s | %(levelname)s | %(name)-45s | %(message)s"
 
 handlers:
     console:
         class: logging.StreamHandler
         level: DEBUG
-        formatter: ffl_std
+        filters: ['version_filter']
+        formatter: fl_std
         stream: ext://sys.stdout
 
     info_file_handler:
         class: logging.handlers.RotatingFileHandler
         level: INFO
-        formatter: ffl_std
+        filters: ['version_filter']
+        formatter: fl_std
         filename: info.log
         maxBytes: 10485760
         backupCount: 10
@@ -23,7 +25,8 @@ handlers:
     error_file_handler:
         class: logging.handlers.RotatingFileHandler
         level: ERROR
-        formatter: ffl_std
+        filters: ['version_filter']
+        formatter: fl_std
         filename: errors.log
         maxBytes: 10485760 # 10MB
         backupCount: 10
@@ -37,4 +40,4 @@ loggers:
 
 root:
     level: INFO
-    handlers: [console, info_file_handler, error_file_handler]
+    handlers: [console, info_file_handler, error_file_handler]
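
The new format string uses `%(version)s`, which is not a standard `LogRecord` attribute, so each handler needs a filter that adds it to every record. The definition of `version_filter` is not visible in the hunks shown here, so the following is only a guess at what such a filter could look like, using the standard `logging` module. `VersionFilter`, `build_logger`, and the logger name are hypothetical.

```python
import io
import logging

FL_STD_FORMAT = (
    "%(asctime)s | %(version)s | %(levelname)s | %(name)-45s | %(message)s"
)


class VersionFilter(logging.Filter):
    """Attach a `version` attribute to every record that passes through."""

    def __init__(self, version: str = "unknown") -> None:
        super().__init__()
        self.version = version

    def filter(self, record: logging.LogRecord) -> bool:
        record.version = self.version
        return True  # never drop records, only annotate them


def build_logger(name: str, version: str, stream) -> logging.Logger:
    """Create a logger whose handler mirrors the `fl_std` formatter."""
    handler = logging.StreamHandler(stream)
    handler.addFilter(VersionFilter(version))
    handler.setFormatter(logging.Formatter(FL_STD_FORMAT))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False  # keep the demo output in one place
    return logger


buffer = io.StringIO()
log = build_logger("demo.party", "1.0.2", buffer)
log.info("registered")
print(buffer.getvalue(), end="")
```

Attaching the filter to the handler rather than to a single logger means records from any logger that reaches the handler get the attribute, which is consistent with the diff adding `filters` under each handler.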