FedMLLM: Federated Fine-tuning MLLM on Multimodal Heterogeneity Data (arXiv:2411.14717)

Directory Structure

.
└── root/
    ├── data/
    │   ├── hateful_memes/
    │   │   ├── minicpmv_data/
    │   │   │   ├── modality-missing/
    │   │   │   │   ├── mrate-0.3/
    │   │   │   │   │   └── partition-alpha0.5-clt10
    │   │   │   │   ├── mrate-0.4/
    │   │   │   │   │   └── partition-alpha0.5-clt10
    │   │   │   │   └── mrate-0.5/
    │   │   │   │       └── partition-alpha0.5-clt10
    │   │   │   ├── modality-single/
    │   │   │   │   ├── image-3/
    │   │   │   │   │   └── partition-alpha0.5-clt10
    │   │   │   │   ├── image-5/
    │   │   │   │   │   └── partition-alpha0.5-clt10
    │   │   │   │   └── image-7/
    │   │   │   │       └── partition-alpha0.5-clt10
    │   │   │   ├── modality-mix/
    │   │   │   │   ├── qrate-0.2/
    │   │   │   │   │   └── partition-alpha0.5-clt10
    │   │   │   │   ├── qrate-0.3/
    │   │   │   │   │   └── partition-alpha0.5-clt10
    │   │   │   │   └── qrate-0.4/
    │   │   │   │       └── partition-alpha0.5-clt10
    │   │   │   ├── partition-alpha5.0-clt10
    │   │   │   ├── partition-alpha1.0-clt10
    │   │   │   └── partition-alpha0.5-clt10
    │   │   └── raw_data/ # Extracted files of the downloaded dataset
    │   │       ├── partition-alpha5.0-clt10
    │   │       ├── partition-alpha1.0-clt10
    │   │       └── partition-alpha0.5-clt10
    │   └── crisis-mmd # Consistent with the *hateful_memes* folder structure.
    └── code/
        ├── data_gen/
        │   ├── data_partition_crisismmd.py
        │   ├── data_partition_hateful.py
        │   ├── gen_data_crisismmd_missing_aug.py
        │   ├── gen_data_crisismmd_missing.py
        │   ├── gen_data_crisismmd_mix_aug.py
        │   ├── gen_data_crisismmd_mix.py
        │   ├── gen_data_crisismmd_single_aug.py
        │   ├── gen_data_crisismmd_single.py
        │   ├── gen_data_crisismmd.py
        │   ├── gen_data_hateful_missing_aug.py
        │   ├── gen_data_hateful_missing.py
        │   ├── gen_data_hateful_mix_aug.py
        │   ├── gen_data_hateful_mix.py
        │   ├── gen_data_hateful_single_aug.py
        │   ├── gen_data_hateful_single.py
        │   └── gen_data_hateful.py
        ├── finetune/
        │   ├── federated_learning/
        │   │   ├── __init__.py
        │   │   ├── fed_global.py
        │   │   └── fed_utils.py
        │   ├── __init__.py
        │   ├── dataset.py
        │   ├── finetune_lora.sh
        │   ├── finetune.py
        │   └── trainer.py
        ├── eval_crisismmd_aug.py
        ├── eval_crisismmd.py
        ├── eval_hateful_aug.py
        └── eval_hateful.py
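
Folder names such as `partition-alpha0.5-clt10` suggest a label-skewed split over 10 clients controlled by a Dirichlet concentration `alpha`. The README does not state this, so treat it as a reading of the names. The sketch below shows what such a split could look like; `dirichlet_partition` and its parameters are hypothetical and are not taken from `data_partition_*.py`.

```python
import random
from collections import defaultdict


def dirichlet_partition(labels, num_clients=10, alpha=0.5, seed=0):
    """Split sample indices across clients with one Dirichlet(alpha) draw per class.

    Assumption: this mirrors a common non-IID recipe, not the repo's actual code.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    clients = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # A Dirichlet sample is a set of independent Gamma(alpha, 1) draws, normalized.
        weights = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(weights)
        start, cumulative = 0, 0.0
        for client_id, weight in enumerate(weights):
            cumulative += weight / total
            if client_id == num_clients - 1:
                end = len(indices)  # last client takes the remainder
            else:
                end = min(round(cumulative * len(indices)), len(indices))
            clients[client_id].extend(indices[start:end])
            start = end
    return clients


if __name__ == "__main__":
    toy_labels = [i % 2 for i in range(200)]
    for cid, part in enumerate(dirichlet_partition(toy_labels)):
        print(cid, len(part))
```

A smaller `alpha` concentrates each class on fewer clients, while a larger value (for example 5.0) gives a split closer to uniform.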

Install

conda create -n FedMLLM python=3.10 -y
conda activate FedMLLM
pip install -r requirements.txt
pip install deepspeed
pip install -U scikit-learn
pip install peft
pip install flash_attn
pip install bitsandbytes
pip install tensorboardX

Dataset

Download dataset

Hateful-Memes download
CrisisMMD download

Dataset processing

# Scripts are in data_gen/ (CrisisMMD shown; the *_hateful*.py scripts use the same naming)
python data_partition_crisismmd.py
python gen_data_crisismmd.py # aligned-modality scenario
python gen_data_crisismmd_missing.py # missing-modality scenario
python gen_data_crisismmd_missing_aug.py # missing-modality scenario with prompt strategy
python gen_data_crisismmd_single.py # cross-modal scenario
python gen_data_crisismmd_mix.py # hybrid-modality scenario
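
To make the missing-modality scenario concrete, here is a minimal sketch that removes either the image or the text from a fixed fraction of samples. It assumes that folder names like `mrate-0.3` denote a missing rate of 0.3. The function `drop_modalities` and the sample schema (`image`, `text`, `label` keys) are hypothetical and not taken from `gen_data_*_missing.py`.

```python
import random


def drop_modalities(samples, missing_rate=0.3, seed=0):
    """Return copies of `samples` in which a `missing_rate` fraction lose one modality.

    Assumption: each sample is a dict with "image" and "text" keys; a dropped
    modality is marked with None. The real scripts may encode this differently.
    """
    rng = random.Random(seed)
    n_missing = round(missing_rate * len(samples))
    missing_ids = set(rng.sample(range(len(samples)), n_missing))

    result = []
    for i, sample in enumerate(samples):
        sample = dict(sample)  # shallow copy so the input list is left untouched
        if i in missing_ids:
            # Drop exactly one modality, chosen by a fair coin flip.
            sample[rng.choice(["image", "text"])] = None
        result.append(sample)
    return result


if __name__ == "__main__":
    toy = [{"image": f"{i}.png", "text": f"caption {i}", "label": 0} for i in range(10)]
    for row in drop_modalities(toy, missing_rate=0.3):
        print(row)
```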

Training

cd finetune/
sh finetune_lora.sh
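
The README does not say which aggregation rule `finetune/federated_learning/fed_global.py` applies. As an illustration only, the sketch below shows plain FedAvg, the sample-count-weighted average that federated fine-tuning frameworks commonly start from. The function `fedavg` and the flat-list parameter format are assumptions; the actual code operates on model tensors and may use a different rule.

```python
def fedavg(client_states, client_sizes):
    """Average client parameter dicts, weighting each client by its sample count.

    Assumption: every state maps a parameter name to a flat list of floats,
    and all clients share the same names and shapes.
    """
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("client_sizes must sum to a positive number")

    averaged = {}
    for name in client_states[0]:
        length = len(client_states[0][name])
        averaged[name] = [
            sum(state[name][i] * size for state, size in zip(client_states, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


if __name__ == "__main__":
    # Two toy clients holding one hypothetical LoRA parameter each.
    states = [{"lora_A": [1.0, 2.0]}, {"lora_A": [3.0, 6.0]}]
    print(fedavg(states, client_sizes=[1, 3]))
```

With LoRA fine-tuning only the adapter weights need to be exchanged each round, which keeps the averaged dictionary small compared with the full model.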

Testing

python eval_crisismmd.py
python eval_hateful.py
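
The README does not list the metrics the evaluation scripts report. Both datasets are classification tasks, so accuracy and macro-F1 are plausible choices; that is an assumption. The dependency-free sketch below shows how the two could be computed from predicted and reference labels. The function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    y_true = [0, 0, 1, 1]
    y_pred = [0, 1, 1, 1]
    print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```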

Citation

@article{xu2024fedmllm,
  title={FedMLLM: Federated Fine-tuning MLLM on Multimodal Heterogeneity Data},
  author={Xu, Binqian and Shu, Xiangbo and Mei, Haiyang and Xie, Guosen and Fernando, Basura and Shou, Mike Zheng and Tang, Jinhui},
  journal={arXiv preprint arXiv:2411.14717},
  year={2024}
}

Acknowledgements

This repo is based on MiniCPM-V and OpenFedLLM. Thanks to the original authors for their work!
