
[Suggestions] Updates suggested for helper.rescale_adapter_scale #1989

Merged (6 commits) on Aug 7, 2024

Conversation

ariG23498 (Contributor)

Linked: #1940

CC: @BenjaminBossan

BenjaminBossan (Member)

Thanks for working on this PR so quickly.

Before going into details, I wanted to discuss something else that came to my mind, namely the argument name alpha. I wonder if we should rename it. Right now, the scaling factor is calculated as config.lora_alpha / config.r. I'm afraid that users could think that if they call, say, rescale_adapter_scale(model, alpha=config.r), they would get a scaling of 1, as they might think that this alpha actually sets the lora_alpha. I'm not sure what a better name would be, as something like scale could also be confusing. WDYT? Also pinging @SimJeg.
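
To make the potential confusion concrete, here is a minimal illustrative sketch (not code from the PR), assuming the argument multiplies each layer's existing scaling:

    # Illustrative numbers; the multiplicative behaviour is the assumption under discussion.
    lora_alpha, r = 16, 8
    scaling = lora_alpha / r  # 2.0, the scaling LoRA applies by default

    # What a user might expect from alpha=r: that lora_alpha is replaced, giving a scaling of 1.0
    expected = r / r          # 1.0

    # What the helper would actually do if alpha multiplies the existing scaling
    actual = scaling * r      # 16.0, not 1.0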

ariG23498 (Contributor, Author)

Would something like scale_multiplier make sense? That is essentially the exact operation we perform.

SimJeg commented Aug 5, 2024

I agree that using alpha is confusing. I proposed it based on the WiSE-FT paper, but it would be confusing for users. I like scale_multiplier. @BenjaminBossan, is there any similar variable in PEFT so we can harmonize the naming?

BenjaminBossan (Member)

We have this in PEFT:

def set_scale(self, adapter, scale):

But as mentioned, I think "scale" is ambiguous, as it could be interpreted as the absolute value. However, I do like scale_multiplier, or even just multiplier, as I think that leaves less room for misinterpretation.

ariG23498 (Contributor, Author)

Renamed the function and the parameter.
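
For reference, a hypothetical usage sketch of the renamed helper; the import path, the name rescale_adapter_scale, and the parameter name multiplier are assumptions based on this thread and the PR title rather than a confirmed final API:

    import torch
    from torch import nn
    from peft import LoraConfig, get_peft_model
    from peft.helpers import rescale_adapter_scale  # assumed location and name

    base = nn.Sequential(nn.Linear(8, 8))
    model = get_peft_model(base, LoraConfig(target_modules=["0"], r=8, lora_alpha=16))

    x = torch.randn(2, 8)
    with rescale_adapter_scale(model, multiplier=0.5):
        # inside the context, each LoRA scaling (lora_alpha / r) is multiplied by 0.5
        y_scaled = model(x)
    # on exit, the original scalings are restored
    y_restored = model(x)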

SimJeg commented Aug 6, 2024

@BenjaminBossan should we add a warning if device == "mps"?

BenjaminBossan (Member)

should we add a warning if device == "mps"?

I'd prefer to add a sentence about this in the docstring until we know better what the issue really is, something along the lines of: "Warning: It has been reported that when using Apple's MPS backend for PyTorch, it is necessary to add a short sleep time after exiting the context before the scales are fully restored".

At the moment, we don't even know whether this reproduces on other ARM Macs or with other PyTorch versions. If it's an extreme edge case, the warning could do more harm than good.
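
A rough sketch of the workaround under discussion, continuing the usage example above; the sleep duration and the MPS check are assumptions, and whether any pause is needed at all is exactly what remains unconfirmed:

    import time

    with rescale_adapter_scale(model, multiplier=0.5):
        y_scaled = model(x)

    # Reported (unconfirmed) quirk on Apple's MPS backend: pause briefly before
    # relying on the restored scales.
    if torch.backends.mps.is_available():
        time.sleep(0.1)
    y_restored = model(x)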

SimJeg commented Aug 6, 2024

Ok, once @ariG23498 adds this to the docstring, I have no more comments :)

BenjaminBossan (Member)

@ariG23498 Could you please merge with/rebase on the latest main branch? Otherwise, some CI will fail. Sorry for the inconvenience.

ariG23498 (Contributor, Author)

Apologies for the delay! I have rebased just now.

HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Review comment on src/peft/helpers.py (outdated, resolved)
BenjaminBossan (Member) left a review:

LGTM. Thanks for the updates @ariG23498 and thanks @SimJeg for your valuable feedback.

BenjaminBossan merged commit 2a5d313 into huggingface:main on Aug 7, 2024. 14 checks passed.