activation function in BERTIntermediate #15

Closed
lukovnikov opened this issue Nov 13, 2018 · 4 comments
@lukovnikov (Contributor)

BERTConfig is not used for BERTIntermediate's activation function: intermediate_act_fn is always gelu. Is this intentional?

https://github.com/huggingface/pytorch-pretrained-BERT/blob/master/modeling.py#L240
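
For context, the module in question looks roughly like the sketch below. This is a paraphrase rather than the exact source; F.gelu stands in for the file's own module-level gelu helper so the snippet is self-contained.

```python
import torch.nn as nn
from torch.nn import functional as F


class BERTIntermediate(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.dense = nn.Linear(config.hidden_size, config.intermediate_size)
        # The activation is fixed here; config.hidden_act is never consulted.
        self.intermediate_act_fn = F.gelu

    def forward(self, hidden_states):
        hidden_states = self.dense(hidden_states)
        hidden_states = self.intermediate_act_fn(hidden_states)
        return hidden_states
```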

@thomwolf (Member)

Yes, I hard-coded that since the pre-trained models are all trained with gelu anyway.
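
For reference, the gelu being hard-coded is the exact GELU, x * Φ(x) with Φ the standard normal CDF. A minimal sketch of that helper, assuming the erf-based formulation typically used in PyTorch ports of BERT:

```python
import math

import torch


def gelu(x):
    # Exact GELU: x * Phi(x), where Phi is the CDF of the standard normal
    # distribution, computed here via the error function.
    return x * 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))
```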

@lukovnikov (Contributor, Author) commented Nov 13, 2018

OK, but since the config is there anyway, isn't it cleaner to use it (to avoid errors for people whose configs use a different activation for some reason)?
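
One way the suggested change could look: map the config's activation name to a callable instead of hard-coding gelu. This is a sketch, not the patch that was eventually merged; the ACT2FN name and the accepted string values are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.nn import functional as F

# Hypothetical lookup table from config string to activation callable.
ACT2FN = {"gelu": F.gelu, "relu": F.relu, "tanh": torch.tanh}


class BERTIntermediate(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.dense = nn.Linear(config.hidden_size, config.intermediate_size)
        # Accept either a string name or a callable on config.hidden_act,
        # so existing configs that set "gelu" keep working unchanged.
        if isinstance(config.hidden_act, str):
            self.intermediate_act_fn = ACT2FN[config.hidden_act]
        else:
            self.intermediate_act_fn = config.hidden_act

    def forward(self, hidden_states):
        return self.intermediate_act_fn(self.dense(hidden_states))
```

Since the released BERT checkpoints all set hidden_act to "gelu", this keeps the default behaviour identical while letting custom configs pick a different activation.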

@thomwolf (Member)

Yes we can. I'll change that in the coming first release (unless you would like to submit a PR, which I would be happy to merge).

@lukovnikov (Contributor, Author)

Yeah, let me clean up and I'll open a PR.
