
Refactor the positional embedding config code #4920

Merged
5 commits merged into master from arashb/fix-falcon on Jan 10, 2024
Conversation

arashb (Contributor) commented Jan 8, 2024

The Mixtral PR #4828 introduced the positional embedding config class as a required argument of the `make_attn_layer()` function. This forced users to override and duplicate the `make_attn_layer()` call for every new model implementation that uses RoPE (and it also broke the Falcon model implementation). This PR:

  • refactors the inference transformer base class to avoid code duplication by adding a new abstract `positional_embedding_config` property (see the sketch below)
  • fixes the Falcon model implementation to use the positional embedding config

The models llama_v2, OPT, Mistral 7B, Mixtral, Falcon, and Phi-2 were tested with this PR.
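
Here is a minimal sketch of the pattern the PR describes. All names below (`TransformerBase`, `PositionalEmbeddingConfig`, `rotate_dim`, the `FalconModel` stub) are illustrative stand-ins, not DeepSpeed's actual API; the point is only how hoisting the config into an abstract property removes the need to override `make_attn_layer()` in each model:

```python
# Illustrative sketch, not DeepSpeed's real classes or signatures.
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


@dataclass
class PositionalEmbeddingConfig:
    """Hypothetical config; real fields depend on the embedding type."""
    embedding_type: str = "none"      # e.g. a RoPE variant
    rotate_dim: Optional[int] = None  # rotation dimension for RoPE, if any


class TransformerBase(ABC):
    """Before the refactor, each RoPE model had to override make_attn_layer()
    just to pass its own positional embedding config. Exposing the config as
    an abstract property keeps a single make_attn_layer() in the base class."""

    @property
    @abstractmethod
    def positional_embedding_config(self) -> PositionalEmbeddingConfig:
        """Each model implementation declares its positional embedding here."""
        ...

    def make_attn_layer(self):
        # One shared implementation: the per-model config is looked up
        # through the property instead of being a required argument.
        cfg = self.positional_embedding_config
        print(f"building attention layer with {cfg}")


class FalconModel(TransformerBase):
    """Falcon uses RoPE, so it overrides only the property, not the builder."""

    @property
    def positional_embedding_config(self) -> PositionalEmbeddingConfig:
        return PositionalEmbeddingConfig(embedding_type="rotary", rotate_dim=64)


FalconModel().make_attn_layer()
```

With this shape, adding a new RoPE model means implementing one property rather than re-declaring the layer-construction call, which is the duplication the PR removes.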

mrwyattii merged commit c1e0205 into master on Jan 10, 2024
9 checks passed
mrwyattii deleted the arashb/fix-falcon branch on January 10, 2024, 17:33
loadams added a commit that referenced this pull request Jan 10, 2024
mrwyattii added a commit that referenced this pull request Jan 23, 2024
follow PR #4920 on Qwen inference code

Co-authored-by: Michael Wyatt <[email protected]>
mauryaavinash95 pushed a commit to mauryaavinash95/DeepSpeed that referenced this pull request Feb 17, 2024
mauryaavinash95 pushed a commit to mauryaavinash95/DeepSpeed that referenced this pull request Feb 17, 2024
rraminen pushed a commit to ROCm/DeepSpeed that referenced this pull request May 9, 2024