Significant difference in size in lora artifacts between different versions of trl and peft #1287
To add some more context: I see that in point 2 above, upon passing |
@debraj135 hmm that's strange: I ran:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

peft_config = LoraConfig(
    r=8
)

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
peft_model = get_peft_model(model, peft_config)
peft_model.save_pretrained("./test-save")
```

and I can confirm the saved folder is ~1.2 MB, which is expected. Can you try with trl '0.7.3.dev0' and peft==0.8.2 to confirm the bug is from TRL?
I also ran:

```python
from peft import LoraConfig
from trl import SFTTrainer
from datasets import load_dataset

peft_config = LoraConfig(
    r=8
)

dataset = load_dataset("imdb", split="train")

trainer = SFTTrainer(
    "facebook/opt-350m",
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=512,
    peft_config=peft_config,
)

trainer.save_model("test-save")
```

on TRL main + peft latest (0.8.2) and I'm still not observing any strange behaviour (the saved model is ~6 MB, which is expected for a 350m model).
Thank you for your response @younesbelkada. Here is my skeleton code for saving the adapters.
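Roughly the following (a sketch; the base model, dataset, and output path below are placeholders standing in for my actual setup, mirroring the SFTTrainer example above):

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTTrainer

peft_config = LoraConfig(r=8)  # same config across both peft versions

dataset = load_dataset("imdb", split="train")  # placeholder dataset

trainer = SFTTrainer(
    "facebook/opt-350m",          # placeholder base model
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=512,
    peft_config=peft_config,
)

trainer.save_model("lora-adapter")  # writes adapter_model.bin / .safetensors
```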
With peft==0.5.0 the adapter_model.bin is 78 MB. Please let me know if you also see the same. This seems to be a peft issue? Do you have any thoughts about why this is occurring?
If you want me to try all the releases between peft 0.5.0 and 0.8.1, please let me know. I'm not familiar with the changes over time or whether the above behavior is unexpected. Update (Thursday, Feb 1st): it appears that this size bump happens when going from 0.6.2 to 0.7.0.
@younesbelkada please do let me know what you think about my observation.
@debraj135 it is definitely a peft issue; I suspect we're hitting a corner case with the recent refactor of peft layers, since we now store base layers as well. Do you observe the same behaviour if you remove |
I wonder if this is related to the recent addition of saving the embedding weights automatically when we detect that they're being trained. Those can be quite big. If possible @debraj135, could you check the |
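For instance, a quick way to inspect what actually ended up in the saved checkpoint (a sketch; the adapter directory below is a placeholder for wherever your run saved it):

```python
import os
import torch

adapter_dir = "path/to/adapter"  # wherever trainer.save_model wrote the adapter
bin_path = os.path.join(adapter_dir, "adapter_model.bin")

if os.path.exists(bin_path):
    state_dict = torch.load(bin_path, map_location="cpu")
else:
    from safetensors.torch import load_file
    state_dict = load_file(os.path.join(adapter_dir, "adapter_model.safetensors"))

# Large non-LoRA entries (e.g. embedding weights) would explain a big file size.
for name, tensor in state_dict.items():
    size_mb = tensor.numel() * tensor.element_size() / 1e6
    print(name, tuple(tensor.shape), f"{size_mb:.1f} MB")
```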
@younesbelkada @BenjaminBossan I will report back later today once I try what both of you have requested. @BenjaminBossan can you help me understand the impact of:
Does this mean that it is not correct to include |
Here is what I see:
With
With
I think my questions still remain unanswered after these observations:
One of my key concerns here is that with |
Hello @debraj135. The default behaviour in PEFT is now to save the embedding layers of the base model when they are part of the target modules or when the embedding layers have been resized. This is because embedding layers are most commonly resized or targeted when new special tokens are added to the vocab. Those embedding layers then contain new token weights that are initialized randomly, and the LoRA weights are tuned with respect to that specific initialization. Hence, it is important to save that specific initialization of the embedding layers; otherwise, during inference the resized embedding layers can be initialized with different random weights, the LoRA weights will no longer be in sync with them, and this will lead to undefined behaviour. If you don't want to save the embedding layers, then pass |
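For reference, a minimal sketch of opting out of this when saving (this assumes the `save_embedding_layers` keyword of `save_pretrained` available in recent peft releases; double-check against your installed version):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
peft_model = get_peft_model(model, LoraConfig(r=8))

# Skip the (potentially large) base-model embedding weights when saving,
# so only the LoRA weights end up in the adapter checkpoint.
peft_model.save_pretrained("./test-save", save_embedding_layers=False)
```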
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
For the same peft config and the same base model, my saved LoRA adapter files seem to have changed in size by orders of magnitude:
Previously: 79 MB adapter_model.bin
- trl '0.7.3.dev0'
- peft '0.5.0'
Currently: 1 GB adapter safetensors file
- trl '0.7.10'
- peft '0.7.1'
Can anyone please help me understand the difference in behavior?