Feature/SK-971 | New object detection example (#703)
KatHellg authored Oct 11, 2024
1 parent a7ef5b2 commit 211ce62
Showing 16 changed files with 488 additions and 3 deletions.
3 changes: 0 additions & 3 deletions examples/README.md

This file was deleted.

123 changes: 123 additions & 0 deletions examples/welding-defect-detection/README.md
@@ -0,0 +1,123 @@

**Note:**

**One of the dependencies in this example has an AGPL license. This dependency is used only in this particular example and not in FEDn in general.**

**If you are new to FEDn, we recommend that you start with the MNIST-Pytorch example instead: https://github.com/scaleoutsystems/fedn/tree/master/examples/mnist-pytorch**

# Welding Defect Object Detection Example

This is an example FEDn project that trains a YOLOv8n model on images of welds to classify them as "good", "bad", or "defected". The dataset is pre-labeled and freely available on Kaggle: https://www.kaggle.com/datasets/sukmaadhiwijaya/welding-defect-object-detection. A few example images are shown below:

<img src="figs/fig1.jpg" width=30% height=30%>

<img src="figs/fig2.jpg" width=30% height=30%>

<img src="figs/fig3.jpg" width=30% height=30%>


This example generalizes to many manufacturing and operations use cases, such as automatic optical inspection. The federated setup lets an organization use data available in different factories and in different parts of the manufacturing process, without having to centralize the data.


## How to run the example

To run the example, follow the steps below. For a more detailed explanation, follow the Quickstart Tutorial: https://fedn.readthedocs.io/en/stable/quickstart.html

**Note: To be able to run this example, you need to have GPU access.**


### 1. Prerequisites

- [Python >=3.8, <=3.12](https://www.python.org/downloads)
- [A project in FEDn Studio](https://fedn.scaleoutsystems.com/signup)
- [A Kaggle account](https://www.kaggle.com/account/login?phase=startSignInTab&returnUrl=%2Fsignup)
- GPU access


### 2. Install FEDn and clone GitHub repo

Install fedn:

```
pip install fedn
```

Clone this repository, then navigate to this example's directory:

```
git clone https://github.com/scaleoutsystems/fedn.git
cd fedn/examples/welding-defect-detection
```


### 3. Creating the compute package and seed model

Create the compute package:

```
fedn package create --path client
```

This creates a file 'package.tgz' in the project folder.

Next, generate the seed model:

```
fedn run build --path client
```

This creates a model file 'seed.npz' in the root of the project. The step builds a virtualenv, so it can take a few minutes depending on hardware and internet connection.
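To get a feel for what a seed file of this kind contains, the sketch below writes a list of NumPy arrays to an `.npz` archive and reads it back in order. It assumes that the parameters are stored as an ordered list of arrays keyed by their index; the actual layout used by FEDn's `numpyhelper` may differ, and the names `write_npz` and `read_npz` are hypothetical.

```python
import io

import numpy as np


def write_npz(arrays, target):
    """Store a list of arrays in an .npz archive, keyed "0", "1", ... (assumed layout)."""
    np.savez_compressed(target, **{str(i): a for i, a in enumerate(arrays)})


def read_npz(source):
    """Read the arrays back in their original order."""
    with np.load(source) as archive:
        return [archive[str(i)] for i in range(len(archive.files))]


# Two toy "layers" stand in for the model's state_dict values.
params = [np.zeros((2, 3), dtype=np.float32), np.ones(4, dtype=np.float32)]

buffer = io.BytesIO()  # in-memory stand-in for a file such as seed.npz
write_npz(params, buffer)
buffer.seek(0)
restored = read_npz(buffer)

for i, layer in enumerate(restored):
    print(i, layer.shape, layer.dtype)
```

Replacing the in-memory buffer with a file path would apply the same round trip to a file on disk, provided the archive follows the assumed layout.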

### 4. Running the project on FEDn

To learn how to set up your FEDn Studio project and connect clients, follow the Quickstart Tutorial: https://fedn.readthedocs.io/en/stable/quickstart.html. When you activate the first client, you will be asked for your Kaggle login credentials so that the welding defect dataset can be downloaded and split into separate client folders.
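The split into client folders follows the `splitset` function in `client/data.py`: the file list is cut into equally sized, contiguous chunks, one per client, with the number of chunks taken from the `FEDN_NUM_DATA_SPLITS` environment variable. A minimal sketch of that logic, using a hypothetical helper name `split_evenly` and made-up file names:

```python
from math import floor


def split_evenly(items, parts):
    """Cut items into `parts` contiguous chunks of equal size.

    Items that do not fit evenly are left out, as in splitset.
    """
    local_n = floor(len(items) / parts)
    return [items[i * local_n : (i + 1) * local_n] for i in range(parts)]


# Seven made-up image names split between two clients.
files = [f"img_{i}.jpg" for i in range(7)]
shards = split_evenly(files, 2)

for client_id, shard in enumerate(shards, start=1):
    print(f"data/clients/{client_id}: {shard}")
```

With seven files and two clients, each client receives three files and the last file is not assigned to anyone.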


## Experiments with results

Below are a few experiments that have been run with this example. A centralized setup serves as the baseline. The federated setup uses two clients, and a few different epoch-to-round ratios have been tested.


### Experimental setup

Aggregator:
- FedAvg
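
FedAvg combines the client models into a global model by averaging their parameters, weighted by how much data each client trained on. The sketch below shows the textbook form of that average on toy arrays; it is not FEDn's aggregator code, and the function name `fedavg` and the sample counts are assumptions made for illustration.

```python
import numpy as np


def fedavg(client_params, client_sizes):
    """Weighted average of per-client parameter lists (textbook FedAvg)."""
    total = float(sum(client_sizes))
    n_layers = len(client_params[0])
    averaged = []
    for layer in range(n_layers):
        # Each client's layer contributes in proportion to its sample count.
        weighted = sum(
            params[layer] * (size / total)
            for params, size in zip(client_params, client_sizes)
        )
        averaged.append(weighted)
    return averaged


# Two clients, each holding a two-layer toy model.
client_a = [np.array([1.0, 2.0]), np.array([[0.0]])]
client_b = [np.array([3.0, 6.0]), np.array([[4.0]])]

global_params = fedavg([client_a, client_b], client_sizes=[100, 300])
print(global_params)
```

Client B holds three times as many samples as client A here, so its parameters get a weight of 0.75 against 0.25.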

Hyperparameters:
- batch size: 16
- learning rate: 0.01
- imgsz: 640

Approach: In each experiment, the number of epochs per round and the number of rounds are chosen such that rounds * epochs = 250.

#### Centralized setup

| Experiment ID| # clients | epochs | rounds |
| ----------- | ---------- | -------- | ------ |
| 0 | 1 | 250 | 1 |

#### Federated setup

| Experiment ID| # clients | epochs | rounds |
| ----------- | ---------- | -------- | ------ |
| 1 | 2 | 5 | 50 |
| 2 | 2 | 10 | 25 |
| 3 | 2 | 25 | 10 |
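
The fixed training budget behind these configurations can be checked with a few lines of arithmetic. The values are copied from the tables in this section; the dictionary layout and the constant name are only for illustration.

```python
TOTAL_EPOCH_BUDGET = 250  # rounds * epochs, per the experimental approach

# Experiment ID -> configuration, copied from the setup tables.
experiments = {
    0: {"clients": 1, "epochs": 250, "rounds": 1},  # centralized baseline
    1: {"clients": 2, "epochs": 5, "rounds": 50},
    2: {"clients": 2, "epochs": 10, "rounds": 25},
    3: {"clients": 2, "epochs": 25, "rounds": 10},
}

totals = {
    exp_id: cfg["epochs"] * cfg["rounds"] for exp_id, cfg in experiments.items()
}

for exp_id, total in totals.items():
    status = "ok" if total == TOTAL_EPOCH_BUDGET else "mismatch"
    print(f"experiment {exp_id}: {total} local epochs in total ({status})")
```

Every configuration gives each participant the same number of local epochs, so the experiments differ only in how often the client models are aggregated.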



### Results

Centralized:

<img src="figs/CentralizedmAP50.png" width=50% height=50%>


Federated:

<img src="figs/2clients_5epochs_50rounds.png" width=50% height=50%>

<img src="figs/2clients_10epochs_25rounds.png" width=50% height=50%>

<img src="figs/2clients_25epochs_10rounds.png" width=50% height=50%>

46 changes: 46 additions & 0 deletions examples/welding-defect-detection/client/custom.yaml
@@ -0,0 +1,46 @@
# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 3 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024] # YOLOv8n summary: 225 layers, 3157200 parameters, 3157184 gradients, 8.9 GFLOPs
  s: [0.33, 0.50, 1024] # YOLOv8s summary: 225 layers, 11166560 parameters, 11166544 gradients, 28.8 GFLOPs
  m: [0.67, 0.75, 768] # YOLOv8m summary: 295 layers, 25902640 parameters, 25902624 gradients, 79.3 GFLOPs
  l: [1.00, 1.00, 512] # YOLOv8l summary: 365 layers, 43691520 parameters, 43691504 gradients, 165.7 GFLOPs
  x: [1.00, 1.25, 512] # YOLOv8x summary: 365 layers, 68229648 parameters, 68229632 gradients, 258.5 GFLOPs

# YOLOv8.0n backbone
backbone:
# [from, repeats, module, args]
- [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
- [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
- [-1, 3, C2f, [128, True]]
- [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
- [-1, 6, C2f, [256, True]]
- [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
- [-1, 6, C2f, [512, True]]
- [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
- [-1, 3, C2f, [1024, True]]
- [-1, 1, SPPF, [1024, 5]] # 9

# YOLOv8.0n head
head:
- [-1, 1, nn.Upsample, [None, 2, "nearest"]]
- [[-1, 6], 1, Concat, [1]] # cat backbone P4
- [-1, 3, C2f, [512]] # 12

- [-1, 1, nn.Upsample, [None, 2, "nearest"]]
- [[-1, 4], 1, Concat, [1]] # cat backbone P3
- [-1, 3, C2f, [256]] # 15 (P3/8-small)

- [-1, 1, Conv, [256, 3, 2]]
- [[-1, 12], 1, Concat, [1]] # cat head P4
- [-1, 3, C2f, [512]] # 18 (P4/16-medium)

- [-1, 1, Conv, [512, 3, 2]]
- [[-1, 9], 1, Concat, [1]] # cat head P5
- [-1, 3, C2f, [1024]] # 21 (P5/32-large)

- [[15, 18, 21], 1, Detect, [nc]] # Detect(P3, P4, P5)
131 changes: 131 additions & 0 deletions examples/welding-defect-detection/client/data.py
@@ -0,0 +1,131 @@
import os
from math import floor
import opendatasets
import shutil

dir_path = os.path.dirname(os.path.realpath(__file__))
abs_path = os.path.abspath(dir_path)


def load_labels(label_dir):
    # Each label line holds: class_id x_center y_center width height
    label_files = os.listdir(label_dir)
    data = []
    for label_file in label_files:
        with open(os.path.join(label_dir, label_file), "r") as file:
            lines = file.readlines()
            for line in lines:
                class_id, x_center, y_center, width, height = map(float, line.strip().split())
                data.append([class_id, x_center, y_center, width, height])
    return data


def load_data(data_path, step):
    if data_path is None:
        data_env = os.environ.get("FEDN_DATA_PATH")
        if data_env is None:
            data_path = f"{abs_path}/data/clients/1"
        else:
            data_path = f"{abs_path}{data_env}"
    if step == "train":
        y = os.listdir(f"{data_path}/train/labels")
        length = len(y)
    elif step == "test":
        y = os.listdir(f"{data_path}/test/labels")
        length = len(y)
    else:
        y = os.listdir(f"{data_path}/valid/labels")
        length = len(y)

    # Return the path to the dataset config and the number of label files
    X = f"{data_path}/data.yaml"
    return X, length


def move_data_yaml(base_dir, new_path):
    # Copy (not move) data.yaml so every client folder gets its own copy
    old_yaml_path = os.path.join(base_dir, "data.yaml")
    new_yaml_path = os.path.join(new_path, "data.yaml")
    shutil.copy(old_yaml_path, new_yaml_path)


def splitset(dataset, parts):
    # Equal-sized contiguous chunks; items that do not divide evenly are dropped
    n = len(dataset)
    local_n = floor(n / parts)
    result = []
    for i in range(parts):
        result.append(dataset[i * local_n : (i + 1) * local_n])
    return result


def build_client_folder(folder, data, idx, subdir):
    # Location of the downloaded split (train, test or valid)
    source_dir = f"{abs_path}/welding-defect-object-detection/The Welding Defect Dataset/The Welding Defect Dataset/{folder}"

    os.makedirs(f"{subdir}/{folder}/images")
    os.makedirs(f"{subdir}/{folder}/labels")
    if folder == "train":
        x = "x_train"
        y = "y_train"
    elif folder == "test":
        x = "x_test"
        y = "y_test"
    else:
        x = "x_val"
        y = "y_val"

    # Move this client's share of images and labels into its folder
    for image in data[x][idx]:
        old_image_path = os.path.join(f"{source_dir}/images", image)
        new_image_path = os.path.join(f"{subdir}/{folder}/images", image)
        shutil.move(old_image_path, new_image_path)
    for label in data[y][idx]:
        old_label_path = os.path.join(f"{source_dir}/labels", label)
        new_label_path = os.path.join(f"{subdir}/{folder}/labels", label)
        shutil.move(old_label_path, new_label_path)


def split(out_dir="data"):
    n_splits = int(os.environ.get("FEDN_NUM_DATA_SPLITS", 1))
    download_dir = f"{abs_path}/welding-defect-object-detection"
    dataset_root = f"{download_dir}/The Welding Defect Dataset/The Welding Defect Dataset"

    # Make dir
    if not os.path.exists(f"{out_dir}/clients"):
        os.makedirs(f"{out_dir}/clients")
    opendatasets.download("https://www.kaggle.com/datasets/sukmaadhiwijaya/welding-defect-object-detection")
    # Load data and convert to dict.
    # os.listdir returns names in arbitrary order, so sort them to split
    # images and labels in a consistent order.
    X_train = sorted(os.listdir(f"{dataset_root}/train/images"))
    X_test = sorted(os.listdir(f"{dataset_root}/test/images"))
    X_val = sorted(os.listdir(f"{dataset_root}/valid/images"))

    y_train = sorted(os.listdir(f"{dataset_root}/train/labels"))
    y_test = sorted(os.listdir(f"{dataset_root}/test/labels"))
    y_val = sorted(os.listdir(f"{dataset_root}/valid/labels"))

    data = {
        "x_train": splitset(X_train, n_splits),
        "y_train": splitset(y_train, n_splits),
        "x_test": splitset(X_test, n_splits),
        "y_test": splitset(y_test, n_splits),
        "x_val": splitset(X_val, n_splits),
        "y_val": splitset(y_val, n_splits),
    }

    # Make splits
    folders = ["train", "test", "valid"]
    for i in range(n_splits):
        subdir = f"{out_dir}/clients/{str(i+1)}"
        if not os.path.exists(subdir):
            for folder in folders:
                build_client_folder(folder, data, i, subdir)
            move_data_yaml(dataset_root, subdir)
    # Remove downloaded directory
    if os.path.exists(download_dir):
        shutil.rmtree(download_dir)


if __name__ == "__main__":
    # Prepare data if not already done
    if not os.path.exists(abs_path + "/data/clients/1"):
        split()
11 changes: 11 additions & 0 deletions examples/welding-defect-detection/client/fedn.yaml
@@ -0,0 +1,11 @@
python_env: python_env.yaml
entry_points:
  build:
    command: python model.py
  startup:
    command: python data.py
  train:
    command: python train.py
  validate:
    command: python validate.py

65 changes: 65 additions & 0 deletions examples/welding-defect-detection/client/model.py
@@ -0,0 +1,65 @@
import collections
from ultralytics import YOLO
import torch

from fedn.utils.helpers.helpers import get_helper

HELPER_MODULE = "numpyhelper"
helper = get_helper(HELPER_MODULE)


def compile_model():
    """Compile the pytorch model.
    :return: The compiled model.
    :rtype: torch.nn.Module
    """
    model = YOLO("custom.yaml")
    return model


def save_parameters(model, out_path):
    """Save model parameters to file.
    :param model: The model to serialize.
    :type model: torch.nn.Module
    :param out_path: The path to save to.
    :type out_path: str
    """
    parameters_np = [val.cpu().numpy() for _, val in model.state_dict().items()]
    helper.save(parameters_np, out_path)


def load_parameters(model_path):
    """Load model parameters from file and populate model.
    :param model_path: The path to load from.
    :type model_path: str
    :return: The loaded model.
    :rtype: torch.nn.Module
    """
    model = compile_model()
    parameters_np = helper.load(model_path)

    params_dict = zip(model.state_dict().keys(), parameters_np)
    state_dict = collections.OrderedDict({key: torch.tensor(x) for key, x in params_dict})
    model.load_state_dict(state_dict, strict=True)
    # Round-trip through a checkpoint file so YOLO wraps the populated model
    torch.save(model, "tempfile.pt")
    model = YOLO("tempfile.pt")
    return model


def init_seed(out_path="seed.npz"):
    """Initialize seed model and save it to file.
    :param out_path: The path to save the seed model to.
    :type out_path: str
    """
    # Init and save
    model = compile_model()
    save_parameters(model, out_path)


if __name__ == "__main__":
    init_seed("../seed.npz")