Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TF: Merge PT and TF behavior for Bart when no decoder_input_ids are passed #17593

Merged
merged 3 commits into from
Jun 8, 2022

Conversation

gante
Copy link
Member

@gante gante commented Jun 7, 2022

What does this PR do?

The main model for TF and PT have a different behavior when no decoder_input_ids are passed. Because of that, the same model with the same inputs has different outputs in the two platforms, in this specific input case.

Looking at the git blame, it seems like this if branch in PT has added after the TF model was added, and we possibly forgot to port it back.

Notes:

  1. All slow tests for Bart were run locally;
  2. This was caught with the updated pt-to-tf CLI, which added much stricter tests.

if decoder_input_ids is None and decoder_inputs_embeds is None:
use_cache = False

output_hidden_states = (
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFBartMainLayer (i.e. this class) is not a stand-alone class, output_hidden_states is always passed from the stand-alone classes. This line was redundant and thus removed.

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Jun 7, 2022

The documentation is not available anymore as the PR was closed or merged.

Copy link
Member

@Rocketknight1 Rocketknight1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I love the idea of this tool shaking out a load of bugs, too.

Copy link
Contributor

@patrickvonplaten patrickvonplaten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice clean-up - thanks!

@gante gante merged commit e9d5138 into huggingface:main Jun 8, 2022
@gante gante deleted the use_cache_bart branch June 8, 2022 16:42
elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022
amyeroberts pushed a commit to amyeroberts/transformers that referenced this pull request Jun 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants