Add TF ResNet model #17427

Merged
amyeroberts merged 47 commits into huggingface:main from add-tf-resnet on Jul 4, 2022

Conversation

amyeroberts
Collaborator

Adds a TensorFlow implementation of the ResNet model and associated tests.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guidelines,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.


HuggingFaceDocBuilderDev commented May 25, 2022

The documentation is not available anymore as the PR was closed or merged.

amyeroberts mentioned this pull request Jun 13, 2022
@amyeroberts
Collaborator Author

I'm seeing a failure in test_keras_fit - it looks like the outputs differ depending on whether the labels are passed in the input dict or separately. That might have nothing to do with the labels, though, and instead be caused by random differences in the model outputs - maybe the training flag isn't being passed correctly, so layers like dropout are still run in training mode at eval time? Alternatively, maybe the tolerances we use for NLP models are just too strict for this one?

@Rocketknight1 Digging into this - I believe this is because of the batch norm layers. Every time the layer is called it updates its moving_mean and moving_variance parameters. During training, the batches are normalised based on the batch stats, which will be exactly the same for both fit calls because the data isn't shuffled - and we see this: the training losses for the two histories in test_keras_fit are exactly the same. At inference, however, the batches are normalised based on the moving_mean and moving_variance parameters. I'm not sure how to address this. @ydshieh have we handled anything like this in tests before?

Weirdly, the test was passing before. I'm guessing just a fluke?
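
A minimal sketch of the mechanism being described (standalone Keras, not the PR's test code): with training=True, a BatchNormalization layer normalises with the current batch statistics and updates its moving averages as a side effect; with training=False, it normalises with the accumulated moving_mean/moving_variance instead.

import tensorflow as tf

bn = tf.keras.layers.BatchNormalization()
x = tf.random.normal((8, 4), mean=5.0)

train_out = bn(x, training=True)   # uses batch stats; also nudges the moving averages
eval_out = bn(x, training=False)   # uses moving_mean/moving_variance instead

print(bn.moving_mean.numpy())             # no longer all zeros after the training call
print(tf.reduce_mean(train_out).numpy())  # ~0: normalised with the batch mean
print(tf.reduce_mean(eval_out).numpy())   # ~5: the moving averages have barely caught up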

@Rocketknight1
Member

Ahhh, of course! I had thought that running a single iteration of training with a learning rate of 0 would leave the weights unchanged, but that isn't true for BatchNorm, because BatchNorm weights aren't updated by gradient descent. The test was broken and we only got away with it because NLP models generally don't use BatchNorm. I'll fix it tomorrow!
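
A quick sketch of that failure mode on a toy model (not the PR's code): a single fit() step with learning rate 0 leaves the Dense kernel untouched, but the BatchNorm moving statistics still drift, because they are updated by an exponential moving average rather than by gradient descent.

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, input_shape=(4,)),
    tf.keras.layers.BatchNormalization(),
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.0), loss="mse")

x = tf.random.normal((16, 4))
y = tf.random.normal((16, 4))

kernel_before = model.layers[0].kernel.numpy().copy()
mean_before = model.layers[1].moving_mean.numpy().copy()

model.fit(x, y, epochs=1, verbose=0)

assert (model.layers[0].kernel.numpy() == kernel_before).all()     # lr=0: weights unchanged
assert (model.layers[1].moving_mean.numpy() != mean_before).any()  # moving stats still drifted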

@amyeroberts
Collaborator Author

@sgugger Sorry - I didn't mean to re-request for you as you'd already approved!

@slow
def test_model_from_pretrained(self):
    for model_name in TF_RESNET_PRETRAINED_MODEL_ARCHIVE_LIST[:1]:
        model = TFResNetModel.from_pretrained(model_name, from_pt=True)
Collaborator Author


To remove once all approved and weights pushed to hub

@slow
def test_inference_image_classification_head(self):
    model = TFResNetForImageClassification.from_pretrained(
        TF_RESNET_PRETRAINED_MODEL_ARCHIVE_LIST[0], from_pt=True
Collaborator Author


To remove once all approved and weights pushed to hub

@amyeroberts
Collaborator Author

@NielsRogge @Rocketknight1 friendly nudge - let me know if there are any other changes or if I'm good to merge :)

Member

Rocketknight1 left a comment


Overall looks good - my main complaint is that I think we could swap the data format for some Conv and MaxPool layers to improve performance and save memory by skipping some transposes!

Comment on lines +115 to +118
if tf.executing_eagerly() and num_channels != self.num_channels:
    raise ValueError(
        "Make sure that the channel dimension of the pixel values match with the one set in the configuration."
    )
Member


I think this check should be okay without tf.executing_eagerly(), since it will still work if it's encountered during function tracing, right?

Contributor


I added this, as @ydshieh suggested, to all the vision models for which we can check the number of channels.

I did get an error in one of the TFViTMAE tests (during save_pretrained) when this check was not added.

Collaborator

ydshieh commented Jun 28, 2022


The following 2 tests (in test_modeling_tf_core.py) will fail

  • test_saved_model_creation
  • test_saved_model_creation_extended

These tests are only run for a few core models, but I think it is good to keep them working.

(I originally introduced these kinds of conditions when adding TF ViT in order to keep @tooslow tests working)
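
For context, a rough sketch of why the guard matters (hypothetical standalone function, not the model code): under a SavedModel-style trace with an unknown channel dimension, the static shape entry is None, so an unguarded Python comparison against the configured value would raise at trace time. tf.executing_eagerly() is False inside the trace, which short-circuits the check.

import tensorflow as tf

@tf.function(input_signature=[tf.TensorSpec([None, None, 224, 224], tf.float32)])
def forward(pixel_values):
    num_channels = pixel_values.shape[1]  # None under this signature
    if tf.executing_eagerly() and num_channels != 3:
        raise ValueError("channel dimension mismatch")
    return pixel_values

# Tracing succeeds: executing_eagerly() is False while the function body is
# traced, so the (None != 3) comparison is never reached.
concrete_fn = forward.get_concrete_function()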

Member


Alright, that makes sense. It can stay!

Comment on lines 149 to 152
# (batch_size, num_channels, height, width) -> (batch_size, height, width, num_channels)
hidden_state = tf.transpose(hidden_state, (0, 2, 3, 1))
hidden_state = self.convolution(hidden_state)
# (batch_size, height, width, num_channels) -> (batch_size, num_channels, height, width)
hidden_state = tf.transpose(hidden_state, (0, 3, 1, 2))
Member

Rocketknight1 commented Jun 28, 2022


Could we just do an NCHW convolution here (data_format="channels_first" for the Conv2D layer) instead of transpose -> NHWC convolution -> untranspose? I believe TF stores the weights in the same shape (H, W, in_dim, out_dim) regardless, so it shouldn't affect crossloading.

Collaborator Author


The reason I have it this way is that Conv2D can't run on CPU with channels_first. You get this error:

The Conv2D op currently only supports the NHWC tensor format on the CPU. The op was given the format: NCHW [Op:Conv2D]
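
For reference, the limitation is easy to reproduce in isolation (a sketch assuming a CPU-only machine; the exact exception class may vary across TF versions):

import tensorflow as tf

conv = tf.keras.layers.Conv2D(8, 3, data_format="channels_first")
x = tf.random.normal((1, 3, 32, 32))  # NCHW input

with tf.device("/CPU:0"):
    try:
        conv(x)
    except tf.errors.UnimplementedError as e:
        print(e)  # "The Conv2D op currently only supports the NHWC tensor format on the CPU ..."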

Comment on lines 83 to 86
# (batch_size, num_channels, height, width) -> (batch_size, height, width, num_channels)
hidden_state = tf.transpose(hidden_state, (0, 2, 3, 1))
hidden_state = self.conv(hidden_state)
# (batch_size, height, width, num_channels) -> (batch_size, num_channels, height, width)
hidden_state = tf.transpose(hidden_state, (0, 3, 1, 2))
Member


Could we just do an NCHW convolution here (data_format="channels_first" for the Conv2D layer) instead of transpose -> NHWC convolution -> untranspose? I believe TF stores the weights in the same shape (H, W, in_dim, out_dim) regardless, so it shouldn't affect crossloading.

Comment on lines 123 to 126
# (batch_size, num_channels, height, width) -> (batch_size, height, width, num_channels)
hidden_state = tf.transpose(hidden_state, (0, 2, 3, 1))
hidden_state = self.pooler(hidden_state)
# (batch_size, height, width, num_channels) -> (batch_size, num_channels, height, width)
hidden_state = tf.transpose(hidden_state, (0, 3, 1, 2))
Member


Could also change the data format on the pooler to channels_first to avoid needing to transpose and untranspose.

Collaborator Author


MaxPool doesn't work with channels_first on CPU either 😞

Default MaxPoolingOp only supports NHWC on device type CPU [Op:MaxPool]
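
The approach the PR eventually converged on (see the "Remove transposes - keep NHWC throughout forward pass" commit below) avoids these per-layer round trips: transpose once at the model boundary and keep everything channels_last in between. A rough sketch with hypothetical layer names:

import tensorflow as tf

class StemSketch(tf.keras.layers.Layer):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.conv = tf.keras.layers.Conv2D(64, 7, strides=2, padding="same")
        self.pool = tf.keras.layers.MaxPool2D(3, strides=2, padding="same")

    def call(self, pixel_values):
        # (batch, channels, height, width) -> (batch, height, width, channels), once
        x = tf.transpose(pixel_values, (0, 2, 3, 1))
        x = self.conv(x)  # channels_last: runs fine on CPU
        x = self.pool(x)  # stays NHWC; no per-layer transposes
        # transpose back only at the output boundary, if NCHW output is expected
        return tf.transpose(x, (0, 3, 1, 2))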

Member


God I hate Tensorflow

Member


🤣

amyeroberts merged commit 77ea513 into huggingface:main Jul 4, 2022
amyeroberts deleted the add-tf-resnet branch July 4, 2022 09:59
viclzhu pushed a commit to viclzhu/transformers that referenced this pull request Jul 18, 2022
* Rough TF conversion outline

* Tidy up

* Fix padding differences between layers

* Add back embedder - whoops

* Match test file to main

* Match upstream test file

* Correctly pass and assign image_size parameter

Co-authored-by: Sayak Paul <[email protected]>

* Add in MainLayer

* Correctly name layer

* Tidy up AdaptivePooler

* Small tidy-up

More accurate type hints and whitespace removal

* Change AdaptiveAvgPool

Use the AdaptiveAvgPool implementation by @Rocketknight1, which correctly pools when the input shape is not evenly divisible by the output shape (a reference sketch follows this commit list), cf. https://github.com/huggingface/transformers/pull/17554/files/9e26607e22aa8d069c86b50196656012ff0ce62a#r900109509

Co-authored-by: matt <[email protected]>
Co-authored-by: Sayak Paul <[email protected]>

* Use updated AdaptiveAvgPool

Co-authored-by: matt <[email protected]>

* Make AdaptiveAvgPool compatible with CPU

* Remove image_size from configuration

* Fixup

* Tensorflow -> TensorFlow

* Fix pt references in tests

* Apply suggestions from code review - grammar and wording

Co-authored-by: NielsRogge <[email protected]>

Co-authored-by: NielsRogge <[email protected]>

* Add TFResNet to doc tests

* PR comments - GlobalAveragePooling and clearer comments

* Remove unused import

* Add in keepdims argument

* Add num_channels check

* grammar fix: by -> of

Co-authored-by: matt <[email protected]>

Co-authored-by: Matt <[email protected]>

* Remove transposes - keep NHWC throughout forward pass

* Fixup look sharp

* Add missing layer names

* Final tidy up - remove from_pt now that weights are on the hub

Co-authored-by: Sayak Paul <[email protected]>
Co-authored-by: matt <[email protected]>
Co-authored-by: NielsRogge <[email protected]>
Co-authored-by: Matt <[email protected]>
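
For the AdaptiveAvgPool change referenced above: TF has no built-in equivalent of PyTorch's AdaptiveAvgPool2d when the input size is not a multiple of the output size. A naive reference sketch of the semantics (per-bin slicing with floor/ceil boundaries; illustrative only, not the PR's actual implementation):

import tensorflow as tf

def adaptive_avg_pool2d_nhwc(x, output_size):
    # x: (batch, height, width, channels) with statically known height/width.
    # Output bin i averages the input slice [floor(i*H/oh), ceil((i+1)*H/oh)),
    # so it behaves correctly even when H is not divisible by oh.
    height, width = x.shape[1], x.shape[2]
    out_h, out_w = output_size
    rows = []
    for i in range(out_h):
        h0, h1 = (i * height) // out_h, -(-((i + 1) * height) // out_h)
        cols = []
        for j in range(out_w):
            w0, w1 = (j * width) // out_w, -(-((j + 1) * width) // out_w)
            cols.append(tf.reduce_mean(x[:, h0:h1, w0:w1, :], axis=[1, 2]))
        rows.append(tf.stack(cols, axis=1))  # (batch, out_w, channels)
    return tf.stack(rows, axis=1)  # (batch, out_h, out_w, channels)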