[11/2024] Project page link released.
[10/2024] Code released.
[10/2024] Paper accepted to IEEE BigData 2024.
[09/2024] arXiv preprint released.
Federated learning (FL) has emerged as a prominent method for collaboratively training machine learning models using local data from edge devices, all while keeping data decentralized. However, accounting for the quality of data contributed by local clients remains a critical challenge in FL, as local data are often susceptible to corruption by various forms of noise and perturbations, which compromise the aggregation process and lead to a subpar global model. In this work, we focus on addressing the problem of noisy data in the input space, an under-explored area compared to the label noise. We propose a comprehensive assessment of client input in the gradient space, inspired by the distinct disparity observed between the density of gradient norm distributions of models trained on noisy and clean input data. Based on this observation, we introduce a straightforward yet effective approach to identify clients with low-quality data at the initial stage of FL. Furthermore, we propose a noise-aware FL aggregation method, namely Federated Noise-Sifting (FedNS), which can be used as a plug-in approach in conjunction with widely used FL strategies. Our extensive evaluation on diverse benchmark datasets under different federated settings demonstrates the efficacy of FedNS. Our method effortlessly integrates with existing FL strategies, enhancing the global model’s performance by up to 13.68% in IID and 15.85% in non-IID settings when learning from noisy decentralized data.
- 🔍 Noise Identification: FedNS identifies noisy clients in the first training round (one-shot).
- 🛡️ Resilient Aggregation: An aggregation strategy that minimizes the impact of noisy clients (see the sketch below this list), ensuring robust global model performance.
- 🔒 Data Confidentiality: Shares only scalar gradient norms to keep data confidential.
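As a minimal sketch of how the identification and aggregation highlights fit together (this is not the repository's exact FedNS implementation; the function names, the outlier rule, and the down-weighting factor are illustrative assumptions), each client can report the scalar L2 norm of its gradients in the first round, and the server can flag outliers and down-weight them during FedAvg-style aggregation:

```python
import torch

def gradient_norm(model):
    """Scalar L2 norm over all parameter gradients (the only value a client shares)."""
    total = 0.0
    for p in model.parameters():
        if p.grad is not None:
            total += p.grad.detach().pow(2).sum().item()
    return total ** 0.5

def sift_clients(client_norms, z_thresh=2.0):
    """Flag clients whose first-round gradient norm is an outlier (hypothetical rule)."""
    norms = torch.tensor(client_norms)
    z = (norms - norms.mean()) / (norms.std() + 1e-8)
    return z.abs() > z_thresh  # True = suspected noisy client

def noise_aware_aggregate(client_states, client_sizes, noisy_mask, noisy_weight=0.1):
    """FedAvg-style averaging with down-weighted contributions from flagged clients."""
    weights = torch.tensor(client_sizes, dtype=torch.float)
    weights[noisy_mask] *= noisy_weight
    weights /= weights.sum()
    return {
        key: sum(w * state[key].float() for w, state in zip(weights, client_states))
        for key in client_states[0]
    }
```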
We provide noisy dataset creation scripts for various benchmarks. Below is an example of generating a noisy CIFAR-10; a sketch of the kind of input corruption involved follows the commands:
- CIFAR10:
cd ./data/cifar10data
python create_cifar10_noisy.py
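For reference, here is a minimal sketch of how input-space noise can be injected into CIFAR-10; the bundled create_cifar10_noisy.py may use different noise types, severity levels, and output formats, and the output file name below is a hypothetical placeholder.

```python
import numpy as np
from torchvision.datasets import CIFAR10

def add_gaussian_noise(images, std=0.1, seed=0):
    """Add pixel-level Gaussian noise to uint8 images (corrupting the input space)."""
    rng = np.random.default_rng(seed)
    noisy = images.astype(np.float32) / 255.0 + rng.normal(0.0, std, images.shape)
    return (np.clip(noisy, 0.0, 1.0) * 255.0).astype(np.uint8)

train_set = CIFAR10(root="./data/cifar10data", train=True, download=True)
noisy_images = add_gaussian_noise(train_set.data, std=0.1)  # train_set.data: (50000, 32, 32, 3) uint8
np.save("./data/cifar10data/cifar10_train_noisy.npy", noisy_images)  # hypothetical output file
```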
For benchmarks with human annotation errors, you can refer to CIFAR-10/100N. For decentralized data generation, please go to the folder ./src_fed; a sketch of a typical non-IID partitioning scheme is shown below.
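As a hedged illustration of decentralized data generation, the snippet below partitions a labeled dataset across clients using a Dirichlet distribution, a common way to simulate non-IID federated splits; the actual scripts under ./src_fed may implement this differently, and the function name and default parameters are assumptions.

```python
import numpy as np

def dirichlet_partition(labels, num_clients=10, alpha=0.5, seed=0):
    """Split sample indices across clients with label skew controlled by alpha."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        cls_idx = rng.permutation(np.where(labels == cls)[0])
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        split_points = (np.cumsum(proportions) * len(cls_idx)).astype(int)[:-1]
        for client_id, shard in enumerate(np.split(cls_idx, split_points)):
            client_indices[client_id].extend(shard.tolist())
    return client_indices  # client_indices[i] holds the sample indices for client i
```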
To prepare your experiment, set up your configuration in main.py. The specific federated learning strategy can be configured in server.py. Then simply execute the main script to run the experiment; the results will be saved as a log file. A hypothetical example of the kind of settings involved is sketched after the commands below.
cd ./scr_fed/cifar10
python main.py
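For orientation only, the block below sketches the kind of settings one would typically adjust in main.py; every key and value here is a hypothetical placeholder and may not match the actual variable names used in this repository.

```python
# Hypothetical configuration sketch -- adapt to the options actually exposed in main.py.
CONFIG = {
    "dataset": "cifar10",
    "num_clients": 10,
    "noisy_client_fraction": 0.4,  # fraction of clients holding corrupted inputs
    "rounds": 100,
    "local_epochs": 5,
    "strategy": "fedavg",          # aggregation strategy, selected in server.py
    "use_fedns": True,             # enable noise-aware sifting/aggregation
    "log_dir": "./logs",
}
```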
If you have any questions, please contact me via email or open an issue.
The code repository for "Collaboratively Learning Federated Models from Noisy Decentralized Data" (IEEE BigData 2024) in PyTorch. If you use any content of this repo for your work, please cite the following bib entry:
@article{li2024collaboratively,
title={Collaboratively Learning Federated Models from Noisy Decentralized Data},
author={Li, Haoyuan and Funk, Mathias and G{\"u}rel, Nezihe Merve and Saeed, Aaqib},
journal={arXiv preprint arXiv:2409.02189},
year={2024}
}