
[common] merge ChatGLM2Attention::forward into Attention::forward #86

Merged 2 commits into intel:main on Dec 1, 2023

Conversation

a3213105
Contributor

Add an epsilon param to LayerNorm to align with RmsNorm and unify the Norm APIs (LayerNorm doesn't use this param); a sketch of the aligned signatures follows below.

Extend qk_shape from 4 to 5 in attention.h to pass key_head_num, which rotary_embedding_chatglm2 needs for multi-query attention; see the second sketch below.

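A minimal sketch of what the aligned Norm signatures could look like. The class layouts, parameter order, and default epsilon values here are illustrative assumptions, not the actual xFasterTransformer code:

```cpp
// Minimal sketch, assuming hypothetical class/method shapes -- this is NOT the
// exact xFasterTransformer API, only an illustration of the aligned signatures.
#include <cmath>
#include <vector>

struct RmsNorm {
    // RmsNorm genuinely uses epsilon in the normalization denominator.
    static void forward(float *out, const float *in, const float *weight,
                        int rows, int cols, float epsilon = 1e-6f) {
        for (int r = 0; r < rows; ++r) {
            float ss = 0.f;
            for (int c = 0; c < cols; ++c) ss += in[r * cols + c] * in[r * cols + c];
            const float scale = 1.f / std::sqrt(ss / cols + epsilon);
            for (int c = 0; c < cols; ++c)
                out[r * cols + c] = in[r * cols + c] * scale * weight[c];
        }
    }
};

struct LayerNorm {
    // epsilon is accepted only so both norms share one calling convention;
    // per the PR description, LayerNorm ignores it (a fixed eps is used inside).
    static void forward(float *out, const float *in, const float *gamma,
                        const float *beta, int rows, int cols,
                        float /*epsilon*/ = 1e-5f) {
        for (int r = 0; r < rows; ++r) {
            float mean = 0.f, var = 0.f;
            for (int c = 0; c < cols; ++c) mean += in[r * cols + c];
            mean /= cols;
            for (int c = 0; c < cols; ++c) {
                const float d = in[r * cols + c] - mean;
                var += d * d;
            }
            const float rstd = 1.f / std::sqrt(var / cols + 1e-5f);
            for (int c = 0; c < cols; ++c)
                out[r * cols + c] = (in[r * cols + c] - mean) * rstd * gamma[c] + beta[c];
        }
    }
};

int main() {
    std::vector<float> in(4, 1.f), out(4), w(4, 1.f), b(4, 0.f);
    // Call shapes are now identical apart from LayerNorm's extra beta.
    RmsNorm::forward(out.data(), in.data(), w.data(), 1, 4, 1e-6f);
    LayerNorm::forward(out.data(), in.data(), w.data(), b.data(), 1, 4, 1e-6f);
    return 0;
}
```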
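And a sketch of the extended qk_shape. The real rotary_embedding_chatglm2 signature and the exact element order of qk_shape in attention.h are assumptions here; the point is only that the 5th entry carries key_head_num so the kernel can handle multi-query attention, where the key/value head count differs from the query head count:

```cpp
// Illustrative sketch only: the real rotary_embedding_chatglm2 signature and
// the exact qk_shape element order are assumptions, not the library's API.
#include <cstdio>

void rotary_embedding_chatglm2(const int *qk_shape) {
    const int batch    = qk_shape[0];
    const int seq_len  = qk_shape[1];
    const int q_heads  = qk_shape[2];
    const int head_dim = qk_shape[3];
    const int kv_heads = qk_shape[4]; // new 5th element: key_head_num
    std::printf("batch=%d seq=%d q_heads=%d head_dim=%d key_heads=%d\n",
                batch, seq_len, q_heads, head_dim, kv_heads);
    // ... apply RoPE to Q ([batch, seq, q_heads, head_dim]) and
    //     K ([batch, seq, kv_heads, head_dim]) here ...
}

int main() {
    // Before this PR qk_shape had 4 entries; the 5th now passes key_head_num.
    int qk_shape[5] = {1, 128, 32, 128, 2}; // e.g. 32 query heads, 2 KV heads
    rotary_embedding_chatglm2(qk_shape);
    return 0;
}
```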
On Nov 27, 2023, a3213105 changed the title from "merge ChatGLM2Attention::forward into Attention::forward" to "[common] merge ChatGLM2Attention::forward into Attention::forward".
@Duyi-Wang
Contributor

Please rebase onto main and fix the conflicts.

@a3213105
Contributor Author

Please rebase onto main and fix the conflicts.

done

Duyi-Wang merged commit 2830926 into intel:main on Dec 1, 2023
1 check passed
abenmao pushed a commit to abenmao/xFasterTransformer that referenced this pull request on Dec 4, 2023.