[docs] Model merging #1423
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hi, is there an estimated timeline for when this will be merged?
Force-pushed from cb44c70 to 9a84309
Hi, it should be ready in the next few days, and by the end of the week at the latest if there are no major issues!
Force-pushed from 9a84309 to bde2219
```python
adapters = ["norobots", "adcopy", "sql"]
weights = [2.0, 0.3, 0.7]
adapter_name = "merge"
density = 0.2
```
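For context, here is a hedged sketch of how these parameters would feed into PEFT's `add_weighted_adapter`, assuming `model` is a `PeftModel` with the three adapters already loaded and `dare_ties` as the combination type under discussion:

```python
# Sketch: merge the loaded adapters into a new adapter called "merge".
# Assumes `model` already has "norobots", "adcopy", and "sql" loaded.
model.add_weighted_adapter(
    adapters=["norobots", "adcopy", "sql"],
    weights=[2.0, 0.3, 0.7],
    adapter_name="merge",
    combination_type="dare_ties",
    density=0.2,
)
model.set_adapter("merge")  # activate the merged adapter
```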
I am not sure how this density parameter works in `dare_ties`. I assume it is used to keep 20% of the parameters and then rescale, as in DARE. However, if TIES is applied on top of that, will it again keep only 20% of the pruned and rescaled checkpoint, leaving 0.2 * 0.2 * 100 = 4% of the parameters, or will it keep 20% overall? This is not a comment on the documentation, but this behaviour is not very clear.
Hello, in `dare_ties`, random pruning based on density happens first, followed by rescaling. After this, the `majority_sign_mask` and `disjoint_merge` are performed as in the ties method. So the pruning is taken from DARE (random, followed by rescaling), and the majority sign mask and disjoint merge come from TIES.
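To make the order of operations concrete, here is a rough, self-contained sketch of those steps (random pruning by density with DARE rescaling, then a TIES-style majority sign mask and disjoint merge). This is illustrative only, not PEFT's actual implementation:

```python
import torch

def dare_ties_sketch(task_tensors, weights, density):
    # DARE: randomly keep a `density` fraction of each task tensor,
    # then rescale the survivors by 1/density.
    pruned = []
    for t in task_tensors:
        mask = torch.bernoulli(torch.full_like(t, density))
        pruned.append(t * mask / density)
    # Apply per-adapter weights and stack: (num_adapters, *param_shape)
    stacked = torch.stack([w * t for w, t in zip(weights, pruned)])
    # TIES majority sign mask: dominant sign per parameter (by total sum)
    majority_sign = torch.sign(stacked.sum(dim=0))
    agree = torch.sign(stacked) == majority_sign
    # TIES disjoint merge: average only entries agreeing with the majority sign
    num_agree = agree.sum(dim=0).clamp(min=1)
    return (stacked * agree).sum(dim=0) / num_agree

# Usage: merged = dare_ties_sketch([torch.randn(4, 4) for _ in range(3)],
#                                  [2.0, 0.3, 0.7], density=0.2)
```

Note that the density-based pruning happens only once, in the random DARE step; there is no second magnitude-based TIES pruning, so roughly `density` of the parameters survive rather than `density**2`.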
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftConfig, PeftModel

config = PeftConfig.from_pretrained("smangrul/tinyllama_lora_norobots")
model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path, load_in_4bit=True, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("smangrul/tinyllama_lora_norobots")

model = PeftModel.from_pretrained(model, "smangrul/tinyllama_lora_norobots", adapter_name="norobots")
_ = model.load_adapter("smangrul/tinyllama_lora_sql", adapter_name="sql")
_ = model.load_adapter("smangrul/tinyllama_lora_adcopy", adapter_name="adcopy")
```
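As a possible continuation of this snippet (a hedged sketch; the prompt and generation settings are illustrative, not part of the diff):

```python
# Sketch: run generation with one of the loaded adapters active.
model.set_adapter("norobots")
inputs = tokenizer("Write a short ad for a coffee shop.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```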
@pacman100 @BenjaminBossan does it make sense to have copies of these checkpoints under the PEFT testing org?
Yes, let's try to put those artifacts on https://huggingface.co/peft-internal-testing.
Looking solid!
Nicely done. On top of what has already been mentioned, I just have one comment about there actually being more than 2 methods. Otherwise, this LGTM.
* content
* code snippets
* api reference
* update
* feedback
* feedback
A guide to new model merging methods introduced in #1364.
todo:
* `build_pr_documentation` test (it should pass)