Fix FlaxBigBirdEmbeddings #17842

ydshieh · 2022-06-23T13:36:44Z

What does this PR do?

The current FlaxBigBirdEmbeddings applies layer norm before dropout, while BigBirdEmbeddings and Google's original BigBird
apply dropout first. This PR fixes this inconsistency.

Flax
(layernorm --> dropout)

transformers/src/transformers/models/big_bird/modeling_flax_big_bird.py

Lines 232 to 233 in 6f29029

    
    hidden_states = self.LayerNorm(hidden_states)
    hidden_states = self.dropout(hidden_states, deterministic=deterministic)

PyTorch
(dropout immediately after embedding)

transformers/src/transformers/models/big_bird/modeling_big_bird.py

Lines 311 to 312 in 6f29029

    
    embeddings = self.dropout(embeddings)
    embeddings = self.LayerNorm(embeddings)

Google
(dropout immediately after embedding)
https://github.com/google-research/bigbird/blob/5f2a5aa7fbab23e32e0e0b41c5f0192f0c023e05/bigbird/core/utils.py#L565-L566
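To illustrate why the order matters, here is a minimal sketch in plain Python. It is not the transformers or BigBird code: the helpers `layer_norm` and `dropout`, the input vector, and the dropout rate are all hypothetical, and the layer norm omits the learned scale and bias. The same random seed is used for both orderings so that the same positions are dropped in each.

```python
import math
import random


def layer_norm(x, eps=1e-12):
    """Normalize a vector to zero mean and unit variance (no learned scale/bias)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def dropout(x, rate, rng):
    """Inverted dropout: zero each element with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in x]


# Hypothetical embedding vector and dropout rate, for illustration only.
embeddings = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 3.0, 1.0]
rate = 0.5

# Ordering before this PR in Flax: layernorm --> dropout
old_flax = dropout(layer_norm(embeddings), rate, random.Random(0))

# Ordering in PyTorch and Google's code: dropout --> layernorm
fixed = layer_norm(dropout(embeddings, rate, random.Random(0)))

print("layernorm --> dropout:", [round(v, 3) for v in old_flax])
print("dropout --> layernorm:", [round(v, 3) for v in fixed])
```

With dropout applied last, the output keeps exact zeros at the dropped positions and is no longer normalized. With layer norm applied last, the output is re-normalized after the drop. When dropout is disabled (deterministic=True in Flax, eval mode in PyTorch), dropout is the identity, so the two orderings only differ during training.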

HuggingFaceDocBuilderDev · 2022-06-23T13:50:10Z

The documentation is not available anymore as the PR was closed or merged.

patrickvonplaten

Thanks for fixing - also cc @vasudevgupta7 can you confirm?

patil-suraj

Thanks for the fix!

ydshieh · 2022-07-01T01:53:02Z

I will merge this PR today.

Co-authored-by: ydshieh <[email protected]>

ydshieh requested review from patrickvonplaten and patil-suraj June 23, 2022 13:37

patrickvonplaten approved these changes Jun 24, 2022


patil-suraj approved these changes Jun 24, 2022


fix order

3a4d51c

ydshieh force-pushed the fix_bigbird_embedding branch from 62d6556 to 3a4d51c June 24, 2022 17:33

ydshieh merged commit 8bb2c38 into huggingface:main Jul 1, 2022

ydshieh deleted the fix_bigbird_embedding branch July 1, 2022 14:46

viclzhu pushed a commit to viclzhu/transformers that referenced this pull request Jul 18, 2022

Fix FlaxBigBirdEmbeddings (huggingface#17842)

d232d4a

Co-authored-by: ydshieh <[email protected]>
