CLI: add stricter automatic checks to `pt-to-tf` #17588

Merged

gante merged 9 commits into huggingface:main from gante:conv_model_head

Jun 8, 2022

Member

gante commented Jun 7, 2022

What does this PR do?

Last week I introduced the pt-to-tf CLI (#17497), enabling automatic weight conversion followed by PR opening.

This PR makes four changes related to that CLI:

- Uses the appropriate model class to load the model, to ensure the head's weights also get converted;
- Adds much stricter checks -- ALL model outputs (with output_hidden_states=True) are verified;
- Adds a flag to create new TF weights, even if they already exist;
- Updates the docker file for the scheduled tests to install git lfs -- I did it for the circleci workflows in the original PR, but forgot to do it for the scheduled tests.
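
As a rough illustration of the stricter check (a minimal sketch, not the code in this PR): first require that both models return the same output attributes, then compute the largest absolute difference for every attribute and fail if it exceeds a tolerance. The function name `check_all_outputs`, the `max_error` default, and the use of flat lists of floats in place of tensors are all assumptions made for the example.

```python
def check_all_outputs(pt_outputs, tf_outputs, max_error=5e-5):
    """Raise ValueError unless every output matches within `max_error`.

    Assumption: both arguments are flat dicts mapping output names to lists
    of floats (stand-ins for tensors). `max_error` is an illustrative value.
    """
    # 1. Both models must return the same set of output attributes.
    if set(pt_outputs) != set(tf_outputs):
        raise ValueError("The model outputs have different attributes, aborting.")

    # 2. For each attribute, find the largest element-wise difference.
    differences = {
        name: max(abs(p - t) for p, t in zip(pt_outputs[name], tf_outputs[name]))
        for name in pt_outputs
    }

    # 3. Report the worst attribute, and fail if it is outside the tolerance.
    worst_name = max(differences, key=differences.get)
    worst_diff = differences[worst_name]
    if worst_diff > max_error:
        raise ValueError(
            f"Max difference {worst_diff:.3e} in `{worst_name}` exceeds {max_error:.3e}."
        )
    return worst_diff, worst_name
```

Checking every attribute, rather than only the last hidden state, is what would expose a head whose weights were not converted, since the head only affects outputs such as the logits.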

🚨 This also means I will double-check previously open Hub PRs (about 10), to confirm that the model head is present in the TF weights (I suspect it isn't in some cases 😢 ) and that the outputs pass the stricter tests.

For context, if the conversion fails because of a difference in the model outputs, we get a message like this one:

gante added 2 commits

June 7, 2022 15:11

Stricter pt-to-tf checks; Update docker image for related tests

4aaa263

check all attributes in the output

7a58886

gante requested review from sgugger, ydshieh and Rocketknight1

June 7, 2022 15:55

gante added 2 commits

June 7, 2022 15:57

make fixup

ada8865

remove type hints (were causing import errors)

692c19e

HuggingFaceDocBuilderDev commented Jun 7, 2022

The documentation is not available anymore as the PR was closed or merged.

sgugger approved these changes

Collaborator

sgugger left a comment

Thanks for working on this!

gante and others added 3 commits

June 7, 2022 17:41

Apply suggestions from code review

52c98ae

Co-authored-by: Sylvain Gugger <[email protected]>

Added import exception; Added flag to create new TF weights

c9b78ac

make fixup

8a9d873

ydshieh approved these changes

src/transformers/commands/pt_to_tf.py Outdated


		return max_difference, max_difference_source

		return compate_pt_tf_values(pt_outputs, tf_outputs)

Collaborator

ydshieh Jun 7, 2022

compare_pt_tf_values 😄

Member Author

gante Jun 7, 2022

Will rename to _compare_pt_tf_models (to avoid a name clash, as Matt mentioned)

src/transformers/commands/pt_to_tf.py Outdated

+                          raise ValueError("The model outputs have different attributes, aborting.")
+                      # 2. For each key, ALL values must be the same
+                      def compate_pt_tf_values(pt_out, tf_out, attr_name=""):

Collaborator

ydshieh Jun 7, 2022

compare_pt_tf_values ..?

src/transformers/commands/pt_to_tf.py

+                              for i, pt_item in enumerate(pt_out):
+                                  # If it is a named attribute, we keep the name. Otherwise, just its index.
+                                  if isinstance(pt_item, str):
+                                      branch_name = root_name + pt_item

Collaborator

ydshieh Jun 7, 2022

I feel that we will need to have something like f"{root_name}.{pt_item}", i.e. to include some kind of separator, so the result names will be more readable.

Member Author

gante Jun 7, 2022

There is no need: the names are not nested (at the moment). As structured, it prints the variable the way we would type it in a Python terminal to access it, so we can copy-paste it for further inspection -- e.g. past_key_values[0][2]
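
A minimal sketch of this naming scheme, under stated assumptions: plain Python numbers stand in for tensors, dicts stand in for the named output attributes, and `max_difference` is a hypothetical name rather than the function in this PR. Named attributes keep their name, positional items get an `[i]` suffix, and the path of the worst mismatch is returned alongside its size.

```python
def max_difference(pt_out, tf_out, root_name=""):
    """Return (largest absolute difference, path of the value that produced it)."""
    # Leaf: a plain number stands in for a tensor in this sketch.
    if isinstance(pt_out, (int, float)):
        return abs(pt_out - tf_out), root_name

    if isinstance(pt_out, dict):
        # Named attributes keep their name.
        children = [(root_name + key, pt_out[key], tf_out[key]) for key in pt_out]
    else:
        # Positional items are addressed by index, as in Python indexing.
        children = [
            (f"{root_name}[{i}]", pt_item, tf_item)
            for i, (pt_item, tf_item) in enumerate(zip(pt_out, tf_out))
        ]

    # Recurse into each child and keep the worst offender.
    best_diff, best_source = 0.0, root_name
    for branch_name, pt_item, tf_item in children:
        diff, source = max_difference(pt_item, tf_item, branch_name)
        if diff > best_diff:
            best_diff, best_source = diff, source
    return best_diff, best_source


pt = {"logits": (0.0, 1.0), "past_key_values": ((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))}
tf = {"logits": (0.0, 1.0), "past_key_values": ((1.0, 2.0, 3.5), (4.0, 5.0, 6.0))}
print(max_difference(pt, tf))  # (0.5, 'past_key_values[0][2]')
```

Because the top-level names are not nested, plain concatenation yields a path that can be pasted after the outputs object as-is.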

ydshieh reviewed

src/transformers/commands/pt_to_tf.py Outdated

+                                  else:
+                                      branch_name = root_name + f"[{i}]"
+                                      tf_item = tf_out[i]
+                                  difference, difference_source = compate_pt_tf_values(pt_item, tf_item, branch_name)

Collaborator

ydshieh Jun 7, 2022

compare_pt_tf_values ..?

Collaborator

ydshieh commented Jun 7, 2022

LGTM, just a few nits if they make sense.
It's indeed very important regarding the head, great catch, @gante !


more verbose error messages

d6612ce

Rocketknight1 reviewed

src/transformers/commands/pt_to_tf.py Outdated

+                  @staticmethod
+                  def compare_pt_tf_models(pt_model, pt_input, tf_model, tf_input):
+                      """
+                      Compares the TensorFload and PyTorch models, given their inputs, returning a tuple with the maximum observed

Member

Rocketknight1 Jun 7, 2022

Suggested change

-                      Compares the TensorFload and PyTorch models, given their inputs, returning a tuple with the maximum observed
+                      Compares the TensorFlow and PyTorch models, given their inputs, returning a tuple with the maximum observed

Rocketknight1 reviewed

src/transformers/commands/pt_to_tf.py

-                      converted_diff = np.max(np.abs(pt_last_hidden_state - tf_last_hidden_state))
+                      del tf_from_pt_model  # will no longer be used, and may have a large memory footprint
+                      tf_model = tf_class.from_pretrained(self._local_dir)
+                      converted_diff, diff_source = self.compare_pt_tf_models(pt_model, pt_input, tf_model, tf_input)

Member

Rocketknight1 Jun 7, 2022

The function is called as compare_pt_tf_models here, but as @ydshieh mentioned it's defined as compate_pt_tf_models, so this bit will probably crash.

Collaborator

ydshieh Jun 7, 2022

It was my bad, my comment should be compate_pt_tf_values --> compare_pt_tf_values.

Nothing wrong about compare_pt_tf_models.

gante mentioned this pull request

TF: Merge PT and TF behavior for Bart when no decoder_input_ids are passed #17593

Merged


Update fn name

c35417f

gante merged commit 78c695e into huggingface:main

gante deleted the conv_model_head branch

June 8, 2022 09:45
