activation function in BERTIntermediate #15

Closed
lukovnikov opened this issue Nov 13, 2018 · 4 comments
@lukovnikov (Contributor)

BERTConfig is not used for BERTIntermediate's activation function: intermediate_act_fn is always gelu. Is this intentional?

https://github.com/huggingface/pytorch-pretrained-BERT/blob/master/modeling.py#L240
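
For context, the module in question looks roughly like the sketch below. This is a paraphrase rather than the exact source; F.gelu stands in for the file's own module-level gelu helper so the snippet is self-contained.

```python
import torch.nn as nn
from torch.nn import functional as F


class BERTIntermediate(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.dense = nn.Linear(config.hidden_size, config.intermediate_size)
        # The activation is fixed here; config.hidden_act is never consulted.
        self.intermediate_act_fn = F.gelu

    def forward(self, hidden_states):
        hidden_states = self.dense(hidden_states)
        hidden_states = self.intermediate_act_fn(hidden_states)
        return hidden_states
```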

@thomwolf (Member)

Yes, I hard-coded that since the pre-trained models are all trained with gelu anyway.
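
For reference, the gelu being hard-coded is the exact GELU, x * Φ(x) with Φ the standard normal CDF. A minimal sketch of that helper, assuming the erf-based formulation typically used in PyTorch ports of BERT:

```python
import math

import torch


def gelu(x):
    # Exact GELU: x * Phi(x), where Phi is the CDF of the standard normal
    # distribution, computed here via the error function.
    return x * 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))
```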

@lukovnikov (Contributor, Author) commented Nov 13, 2018

OK, but since the config is there anyway, isn't it cleaner to use it (to avoid errors for people whose configs use a different activation for some reason)?
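
One way the suggested change could look: map the config's activation name to a callable instead of hard-coding gelu. This is a sketch, not the patch that was eventually merged; the ACT2FN name and the accepted string values are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.nn import functional as F

# Hypothetical lookup table from config string to activation callable.
ACT2FN = {"gelu": F.gelu, "relu": F.relu, "tanh": torch.tanh}


class BERTIntermediate(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.dense = nn.Linear(config.hidden_size, config.intermediate_size)
        # Accept either a string name or a callable on config.hidden_act,
        # so existing configs that set "gelu" keep working unchanged.
        if isinstance(config.hidden_act, str):
            self.intermediate_act_fn = ACT2FN[config.hidden_act]
        else:
            self.intermediate_act_fn = config.hidden_act

    def forward(self, hidden_states):
        return self.intermediate_act_fn(self.dense(hidden_states))
```

Since the released BERT checkpoints all set hidden_act to "gelu", this keeps the default behaviour identical while letting custom configs pick a different activation.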

@thomwolf (Member)

Yes we can. I'll change that in the coming first release (unless you would like to submit a PR, which I would be happy to merge).

@lukovnikov (Contributor, Author)

Yeah, let me clean up and I'll open a PR.
