FIX: New bloom changes breaking prompt learning #1969
Merged
Bloom had two dimensions of the attention layer transposed (compared to all other transformers models), which was fixed by huggingface/transformers#31445. Therefore, for future transformers versions, the special handling in PEFT is skipped.
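A minimal sketch of what such a version gate could look like; the cutoff version and helper names below are placeholders for illustration, not the actual PEFT internals:

```python
import transformers
from packaging import version

# Assumed cutoff: the first transformers release that contains
# huggingface/transformers#31445 (the exact version is a placeholder).
TRANSPOSE_FIX_VERSION = version.parse("4.44.0")

def bloom_needs_special_handling() -> bool:
    # On older transformers, Bloom stores the attention keys with the
    # last two dimensions swapped relative to all other architectures,
    # so PEFT has to transpose them when injecting prompts.
    return version.parse(transformers.__version__) < TRANSPOSE_FIX_VERSION

def maybe_fix_bloom_key(key):
    # Only apply the legacy transpose when the old layout is in use;
    # on fixed transformers versions, the key is left untouched.
    if bloom_needs_special_handling():
        key = key.transpose(-2, -1)
    return key
```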
There was also an issue where prompt injection did not take place when `past_key_values` was an empty `Cache` object. This should now hopefully work as expected.
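To illustrate the pitfall (the helper below is hypothetical, not the actual PEFT code): an empty `Cache` is still a truthy, non-`None` object, so a naive check can mistake it for an existing cache and skip the injection; its sequence length has to be checked explicitly.

```python
from transformers.cache_utils import Cache

def should_inject_prompt(past_key_values) -> bool:
    # Prompt injection should happen on the first forward pass, i.e.
    # when nothing has been cached yet.
    if past_key_values is None:
        return True
    if isinstance(past_key_values, Cache):
        # `if past_key_values:` would be True even for an empty Cache,
        # so check the cached sequence length instead.
        return past_key_values.get_seq_length() == 0
    # Legacy format: tuple of (key, value) tensors per layer.
    return len(past_key_values) == 0
```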
For the time being, `past_key_values` is still being passed as a tuple of tensors; I'm not sure if eventually this must be a `Cache` object. If yes, this will need fixing in a future PR. The same goes for the usage of `get_seq_length()` on `Cache` objects.
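Should a `Cache` object become mandatory, one possible direction for that future PR would be wrapping the legacy tuple via the existing `DynamicCache.from_legacy_cache` helper (a sketch only, not part of this PR):

```python
from transformers.cache_utils import DynamicCache

def ensure_cache(past_key_values):
    # Wrap the legacy tuple-of-tensors format in a Cache object; a
    # Cache instance (or None) is passed through unchanged.
    if isinstance(past_key_values, tuple):
        past_key_values = DynamicCache.from_legacy_cache(past_key_values)
    return past_key_values
```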
Note: I tested this locally with transformers installed from main, and the Bloom tests passed. PEFT CI only runs these tests nightly, not on each PR.