add mlp bias for llama models #30031

mayank31398 · 2024-04-04T03:11:17Z

This adds bias support for the MLP in Llama.
Can we add this?
It would help a lot for some models we are developing.
@ArthurZucker and @younesbelkada

This lets us re-use the llama model class for our models :) without adding another class for a new model.
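For readers who want to see what the flag changes numerically, here is a minimal sketch in plain Python (not PyTorch, so it runs standalone). It assumes the usual Llama gated MLP, `down_proj(act(gate_proj(x)) * up_proj(x))`, with SiLU as the activation. The helper names `linear` and `gated_mlp`, the `mlp_bias` keyword, and the toy weights are hypothetical illustrations and are not taken from the PR diff.

```python
import math


def silu(v):
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def linear(x, weight, bias=None):
    """Compute y = W x (+ b). `weight` is a list of rows, one per output feature."""
    out = [sum(w * xi for w, xi in zip(row, x)) for row in weight]
    if bias is not None:
        out = [o + b for o, b in zip(out, bias)]
    return out


def gated_mlp(x, params, mlp_bias=False):
    """Gated MLP. Bias vectors in `params` are only used when mlp_bias is True."""
    def proj(name, inp):
        bias = params[name + ".bias"] if mlp_bias else None
        return linear(inp, params[name + ".weight"], bias)

    gate = [silu(g) for g in proj("gate_proj", x)]
    up = proj("up_proj", x)
    hidden = [g * u for g, u in zip(gate, up)]
    return proj("down_proj", hidden)


# Toy sizes (assumed for illustration): hidden_size=2, intermediate_size=3.
# gate/up biases are zero here so that only the down_proj bias shifts the output.
params = {
    "gate_proj.weight": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "gate_proj.bias": [0.0, 0.0, 0.0],
    "up_proj.weight": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "up_proj.bias": [0.0, 0.0, 0.0],
    "down_proj.weight": [[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]],
    "down_proj.bias": [0.5, -0.5],
}

x = [1.0, 2.0]
without_bias = gated_mlp(x, params, mlp_bias=False)
with_bias = gated_mlp(x, params, mlp_bias=True)
```

With the flag off, the bias entries are ignored entirely, which matches the behaviour of a bias-free Llama MLP; with it on, each projection adds its bias vector.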

mayank31398 · 2024-04-04T18:09:59Z

@younesbelkada light ping again

ArthurZucker

Hey! Though this is not necessarily against our transformers philosophy, we would need to have justification: meaning a new cool model released rather than a promise that it will be released!
This should be easy to have on the hub with `trust_remote_code=True`, no?

ArthurZucker

Alright! Now that you have a new model coming this makes sense. Could you make sure the CIs pass?

younesbelkada

Great work !

HuggingFaceDocBuilderDev · 2024-05-03T08:27:55Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

younesbelkada

Great, thanks !

mayank31398 · 2024-05-03T08:57:20Z

Thanks everyone for the quick turnaround 🤗

#### Motivation

The `Calico` models currently set the mlp and attention bias to true, which was hard-coded to false in flash and paged llama implementations. This will use the config params set in huggingface/transformers#30031 to set those values properly.

#### Modifications

- added attention_bias, mlp_bias to config for Flash and Paged Llama implementations (default is False)
- set bias in attention and mlp to the config value

#### Result

Models should be able to load properly if containing attention and mlp bias

---------

Signed-off-by: Joshua Rosenkranz <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
Co-authored-by: Joe Runde <[email protected]>
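A minimal sketch of the pattern described under Modifications, assuming the model config is available as a plain dict parsed from `config.json`. The helper names `read_bias_flags` and `expected_mlp_weight_names` and the `model.layers.0.mlp` prefix are hypothetical and are not the text-generation-inference code; only the `attention_bias` and `mlp_bias` keys and their `False` default come from the description above.

```python
import json


def read_bias_flags(config):
    """Read both bias flags, falling back to False when a key is absent."""
    return {
        "attention_bias": bool(config.get("attention_bias", False)),
        "mlp_bias": bool(config.get("mlp_bias", False)),
    }


def expected_mlp_weight_names(config, prefix="model.layers.0.mlp"):
    """List the MLP tensors a loader would look for (prefix is an assumed example)."""
    use_bias = read_bias_flags(config)["mlp_bias"]
    names = []
    for proj in ("gate_proj", "up_proj", "down_proj"):
        names.append(f"{prefix}.{proj}.weight")
        if use_bias:
            names.append(f"{prefix}.{proj}.bias")
    return names


# An older config without the keys keeps the previous bias-free behaviour.
legacy_config = json.loads('{"hidden_size": 4096}')
# A config that enables both biases, as the description says the Calico models do.
biased_config = json.loads(
    '{"hidden_size": 4096, "attention_bias": true, "mlp_bias": true}'
)

legacy_names = expected_mlp_weight_names(legacy_config)
biased_names = expected_mlp_weight_names(biased_config)
```

Defaulting to `False` means existing Llama checkpoints load unchanged, while a checkpoint that ships bias tensors is matched only when its config asks for them.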

* add bias * fix quality

add bias

f28600f

ArthurZucker reviewed Apr 5, 2024

mayank31398 added 2 commits May 2, 2024 15:54

Merge branch 'main' into llama-bias

d37f7ee

Merge branch 'main' into llama-bias

0aba976

ArthurZucker approved these changes May 3, 2024

younesbelkada approved these changes May 3, 2024

fix quality

a41eca1

mayank31398 force-pushed the llama-bias branch from 978df7f to a41eca1 Compare May 3, 2024 08:44

younesbelkada approved these changes May 3, 2024

younesbelkada merged commit 425e1a0 into huggingface:main May 3, 2024
21 checks passed

mayank31398 deleted the llama-bias branch May 3, 2024 09:05

This was referenced May 6, 2024

added attn and mlp bias IBM/text-generation-inference#83

Closed

Added attn and mlp bias IBM/text-generation-inference#84

Closed

added mlp and attn bias option to flash and paged llama models IBM/text-generation-inference#85

Merged

heyselbi mentioned this pull request May 7, 2024

added mlp and attn bias option to flash and paged llama models (#85) red-hat-data-services/text-generation-inference#32

Merged

psyv282j9d mentioned this pull request May 7, 2024

Add Support for IBM Granite ggerganov/llama.cpp#7116

Closed

itazap pushed a commit that referenced this pull request May 14, 2024

add mlp bias for llama models (#30031)

a3a05f3

* add bias * fix quality

Semihal mentioned this pull request May 29, 2024

[New Model]: IBM Granite Code Models vllm-project/vllm#5095

Closed

younesbelkada mentioned this pull request Jun 20, 2024

Granite language models #31502

Merged
