Knowledge distillation based federated learning for partially labeled datasets

This repository includes inplementation of the paper "Federated Learning with Knowledge Distillation for Multi-Organ Segmentation with Partially Labeled Datasets" published in Medical Image Analysis. This code implementation is based on DoDNet and LG-FedAVG.

MOTS Dataset Preparation

Before starting, MOTS should be re-built from the serveral medical organ and tumor segmentation datasets

Partial-label task	Data source
Liver	data
Kidney	data
Hepatic Vessel	data
Pancreas	data
Colon	data
Lung	data
Spleen	data

Download and put these datasets in dataset/0123456/.
Re-spacing the data by python re_spacing.py, the re-spaced data will be saved in 0123456_spacing_same/. If there are some issues with this code, try python re_spacing_modify.py.

The folder structure of dataset should be like

dataset/0123456_spacing_same/
├── 0Liver
|    └── imagesTr
|        ├── liver_0.nii.gz
|        ├── liver_1.nii.gz
|        ├── ...
|    └── labelsTr
|        ├── liver_0.nii.gz
|        ├── liver_1.nii.gz
|        ├── ...
├── 1Kidney
├── ...

Voxel spacing information is missing in some CT volumes. Exclude these examples in trainig and testing, i.e., modify MOTS_train.txt and MOTS_test.txt file. Now you should use MOTS_train_modify.txt and MOTS_test_modify.txt file. Here is the exception list with different array size between image and label. [file path, image shape, label shape]

['./0Liver/imagesTr/liver_48.nii.gz', ]
['./0Liver/imagesTr/liver_49.nii.gz', (611, 611, 423), (640, 640, 169)]
['./0Liver/imagesTr/liver_50.nii.gz', (580, 580, 400), (640, 640, 160)]
['./0Liver/imagesTr/liver_51.nii.gz', (577, 577, 454), (640, 640, 151)]
['./0Liver/imagesTr/liver_52.nii.gz', (535, 535, 395), (640, 640, 158)]

These cases looks fine.But it matters in evaluation because the shape of them must be same.

['./0Liver/imagesTr/liver_85.nii.gz', (412, 412, 294), (412, 412, 293)]
Hepatic vessels have just 1 voxel difference. maybe fine to use it.
['./2HepaticVessel/imagesTr/hepaticvessel_178.nii.gz', (484, 484, 143), (485, 485, 143)]
['./2HepaticVessel/imagesTr/hepaticvessel_221.nii.gz', (624, 624, 123), (625, 625, 123)]

here is the data list which miss spacing information.

./0123456/0Liver/imagesTr/volume-28.nii
./0123456/0Liver/imagesTr/volume-29.nii
./0123456/0Liver/imagesTr/volume-30.nii
./0123456/0Liver/imagesTr/volume-31.nii
./0123456/0Liver/imagesTr/volume-32.nii
./0123456/0Liver/imagesTr/volume-33.nii
./0123456/0Liver/imagesTr/volume-34.nii
./0123456/0Liver/imagesTr/volume-35.nii
./0123456/0Liver/imagesTr/volume-36.nii
./0123456/0Liver/imagesTr/volume-37.nii
./0123456/0Liver/imagesTr/volume-38.nii
./0123456/0Liver/imagesTr/volume-39.nii
./0123456/0Liver/imagesTr/volume-40.nii
./0123456/0Liver/imagesTr/volume-41.nii
./0123456/0Liver/imagesTr/volume-42.nii
./0123456/0Liver/imagesTr/volume-43.nii
./0123456/0Liver/imagesTr/volume-44.nii
./0123456/0Liver/imagesTr/volume-45.nii
./0123456/0Liver/imagesTr/volume-46.nii
./0123456/0Liver/imagesTr/volume-47.nii

MOTS_train.txt : original textfile
MOTS_train_modify.txt : Kidney resampling part is modified and the cases which have different image and label shape were removed.
MOTS_train_modify_v2.txt : Cases which miss spacing information were excluded.

Training

federated averaging (FedAVG) KL divergence loss is not working with binary cross entropy loss. It works with softmax predictions. Implementation is not available in this case.

FL=fedavg or flkd
ARCH=multihead or sep_enc or sep_dec or dynconv
EPOCH=1000
ITER=80

time CUDA_VISIBLE_DEVICES=${GPU} python train.py \
  --train_list="list/MOTS/MOTS_train_modify_v2.txt" \
  --snapshot_dir="snapshots/${ARCH}_${FL}_${EPOCH}epoch_${ITER}iter" \
  --input_size="64,128,128" \
  --batch_size=2 \
  --num_gpus=1 \
  --num_epochs=${EPOCH} \
  --start_epoch=0 \
  --learning_rate=1e-2 \
  --num_classes=2 \
  --num_workers=8 \
  --weight_std=True \
  --random_mirror=True \
  --random_scale=True \
  --FP16=False \
  --arch=${ARCH} \
  --local_min_update=${ITER} \
  --lr_decay=True \
  --model=${FL}

check tensorboard

tensorboard --bind_all --logdir=./snapshots --load_fast=false

Evaluation

You can change input size bigger like '64,256,256'.

time CUDA_VISIBLE_DEVICES=${GPU} python evaluate.py \
--val_list="list/MOTS/MOTS_test_modify_v2.txt" \
--reload_from_checkpoint=True \
--reload_path=[PATH_TO_CKPT] \
--ensemble=False \
--save_path="./outputs/${ARCH}_${FL}_${EPOCH}epoch_${ITER}/" \
--input_size="64,128,128" \
--batch_size=1 \
--num_gpus=1 \
--num_workers=2 \
--arch=${ARCH} \
--FP16=False

Post-processing

time python ../postp.py --img_folder_path="./outputs/${ARCH}_${FL}_${EPOCH}epoch_${ITER}iter/" | tee -a "./outputs/${ARCH}_${FL}_${EPOCH}epoch_${ITER}iter.txt"

Citation

If this code is helpful for your study, please cite:

@article{kim2024federated,
  title={Federated learning with knowledge distillation for multi-organ segmentation with partially labeled datasets},
  author={Kim, Soopil and Park, Heejung and Kang, Myeongkyun and Jin, Kyong Hwan and Adeli, Ehsan and Pohl, Kilian M and Park, Sang Hyun},
  journal={Medical Image Analysis},
  volume={95},
  pages={103156},
  year={2024},
  publisher={Elsevier}
}

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
dataset		dataset
fl		fl
loss_functions		loss_functions
train_local_model		train_local_model
utils		utils
README.md		README.md
engine.py		engine.py
postp.py		postp.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Knowledge distillation based federated learning for partially labeled datasets

MOTS Dataset Preparation

Training

Evaluation

Post-processing

Citation

About

Releases

Packages

Languages

oopil/FLKDOrganSeg

Folders and files

Latest commit

History

Repository files navigation

Knowledge distillation based federated learning for partially labeled datasets

MOTS Dataset Preparation

Training

Evaluation

Post-processing

Citation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages