
CLI: add stricter automatic checks to pt-to-tf #17588

Merged -- 9 commits, Jun 8, 2022

Conversation

@gante (Member) commented Jun 7, 2022

What does this PR do?

Last week I introduced the pt-to-tf CLI (#17497), enabling automatic weight conversion followed by automatically opening a Hub PR.

This PR makes four changes related to that CLI:

  1. Uses the appropriate model class to load the model, to ensure the head's weights also get converted;
  2. Adds much stricter checks -- ALL model outputs (with output_hidden_states=True) are verified;
  3. Adds a flag to create new TF weights, even if they already exist;
  4. Updates the Dockerfile for the scheduled tests to install git lfs -- I did this for the CircleCI workflows in the original PR, but forgot to do it for the scheduled tests.

🚨 This also means I will double-check the previously opened Hub PRs (about 10), to confirm that the model head is present in the TF weights (I suspect it isn't in some cases 😢) and that the outputs pass the stricter tests.

For context, if the conversion fails because of a difference in the model outputs, we get a message like this one:
[Screenshot: error message from a failed conversion]
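The stricter check described above boils down to recursively walking two (possibly nested) output structures and tracking the largest absolute difference together with the branch where it occurred. Below is a minimal sketch of that idea using plain numpy arrays; the function name and structure are hypothetical and stand in for the CLI's actual comparison of PyTorch and TensorFlow model outputs.

```python
import numpy as np

def compare_outputs(pt_out, tf_out, name=""):
    """Recursively compare two nested output structures.

    Returns (max absolute difference, name of the branch where it occurred).
    Hypothetical sketch: the real CLI compares actual PT/TF model outputs
    with output_hidden_states=True.
    """
    if isinstance(pt_out, np.ndarray):
        # Leaf: compare the tensors directly.
        return float(np.max(np.abs(pt_out - tf_out))), name
    max_diff, max_src = 0.0, name
    for i, (pt_item, tf_item) in enumerate(zip(pt_out, tf_out)):
        diff, src = compare_outputs(pt_item, tf_item, f"{name}[{i}]")
        if diff > max_diff:
            max_diff, max_src = diff, src
    return max_diff, max_src

# Toy example: a mismatch hidden two levels deep is found and named.
pt = [np.zeros((2, 2)), [np.ones(3), np.zeros(3)]]
tf = [np.zeros((2, 2)), [np.ones(3) + 1e-4, np.zeros(3)]]
diff, src = compare_outputs(pt, tf, "outputs")
# diff is ~1e-4 and src is "outputs[1][0]"
```

Reporting the branch name alongside the difference is what makes the error message in the screenshot actionable: it tells you exactly which output tensor diverged, not just that something did.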

@HuggingFaceDocBuilderDev commented Jun 7, 2022

The documentation is not available anymore as the PR was closed or merged.

@sgugger (Collaborator) left a comment

Thanks for working on this!

(Three review threads on src/transformers/commands/pt_to_tf.py, marked outdated and resolved.)

return max_difference, max_difference_source

return compate_pt_tf_values(pt_outputs, tf_outputs)
@ydshieh (Collaborator) commented Jun 7, 2022
compare_pt_tf_values 😄

@gante (Member, Author) commented Jun 7, 2022
Will rename to _compare_pt_tf_models (to avoid a name clash, as Matt mentioned)

raise ValueError("The model outputs have different attributes, aborting.")

# 2. For each key, ALL values must be the same
def compate_pt_tf_values(pt_out, tf_out, attr_name=""):
@ydshieh (Collaborator)

compare_pt_tf_values ..?

for i, pt_item in enumerate(pt_out):
# If it is a named attribute, we keep the name. Otherwise, just its index.
if isinstance(pt_item, str):
branch_name = root_name + pt_item
(Collaborator)
I feel that we will need to have something like f"{root_name}.{pt_item}", i.e. to include some kind of separator, so the result names will be more readable.

@gante (Member, Author)

There is no need: the names are not nested (at the moment). As structured, it prints the variable exactly as we would type it in a Python terminal to retrieve it, so we can copy-paste it for further inspection -- e.g. past_key_values[0][2]

else:
branch_name = root_name + f"[{i}]"
tf_item = tf_out[i]
difference, difference_source = compate_pt_tf_values(pt_item, tf_item, branch_name)
@ydshieh (Collaborator) commented Jun 7, 2022
compare_pt_tf_values ..?

@ydshieh (Collaborator) commented Jun 7, 2022

LGTM, just a few nits if they make sense.
The point about the head is indeed very important -- great catch, @gante!

@staticmethod
def compare_pt_tf_models(pt_model, pt_input, tf_model, tf_input):
"""
Compares the TensorFload and PyTorch models, given their inputs, returning a tuple with the maximum observed
(Member)
Suggested change
Compares the TensorFload and PyTorch models, given their inputs, returning a tuple with the maximum observed
Compares the TensorFlow and PyTorch models, given their inputs, returning a tuple with the maximum observed

converted_diff = np.max(np.abs(pt_last_hidden_state - tf_last_hidden_state))
del tf_from_pt_model # will no longer be used, and may have a large memory footprint
tf_model = tf_class.from_pretrained(self._local_dir)
converted_diff, diff_source = self.compare_pt_tf_models(pt_model, pt_input, tf_model, tf_input)
(Member)
The function is called as compare_pt_tf_models here, but as @ydshieh mentioned it's defined as compate_pt_tf_models, so this bit will probably crash.

(Collaborator)
It was my bad, my comment should be compate_pt_tf_values --> compare_pt_tf_values.

Nothing wrong about compare_pt_tf_models.

@gante gante merged commit 78c695e into huggingface:main Jun 8, 2022
@gante gante deleted the conv_model_head branch June 8, 2022 09:45
elusenji pushed a commit to elusenji/transformers that referenced this pull request Jun 12, 2022
* Stricter pt-to-tf checks; Update docker image for related tests

* check all attributes in the output

Co-authored-by: Sylvain Gugger <[email protected]>
amyeroberts pushed a commit to amyeroberts/transformers that referenced this pull request Jun 16, 2022
* Stricter pt-to-tf checks; Update docker image for related tests

* check all attributes in the output

Co-authored-by: Sylvain Gugger <[email protected]>