Add ConvNeXt models #16421

Merged
merged 22 commits into from
May 10, 2022

sayakpaul
Contributor

@sayakpaul sayakpaul commented Apr 16, 2022

Closes #16321

Conversion scripts and ImageNet-1k evaluation are available here: https://github.com/sayakpaul/keras-convnext-conversion.

Comparison to the actual reported numbers

| name | original acc@1 (%) | keras acc@1 (%) |
|---|---|---|
| convnext_tiny_1k_224 | 82.1 | 81.312 |
| convnext_small_1k_224 | 83.1 | 82.392 |
| convnext_base_21k_1k_224 | 85.8 | 85.364 |
| convnext_large_21k_1k_224 | 86.6 | 86.36 |
| convnext_xlarge_21k_1k_224 | 87.0 | 86.732 |
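
To make the difference between the two columns explicit, here is a small sketch that recomputes the gaps from the numbers reported above. The dictionary name and layout are illustrative only and do not come from the conversion repository.

```python
# Top-1 accuracies (%) copied from the comparison above: (original, Keras port).
reported = {
    "convnext_tiny_1k_224": (82.1, 81.312),
    "convnext_small_1k_224": (83.1, 82.392),
    "convnext_base_21k_1k_224": (85.8, 85.364),
    "convnext_large_21k_1k_224": (86.6, 86.36),
    "convnext_xlarge_21k_1k_224": (87.0, 86.732),
}

# A positive gap means the Keras port scores lower than the original number.
gaps = {
    name: round(original - keras_acc, 3)
    for name, (original, keras_acc) in reported.items()
}

for name, gap in gaps.items():
    print(f"{name}: {gap:.3f} points below the original")
```

Every gap is below one percentage point, and the smallest one belongs to the large 21k-pretrained variant.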

@LukeWood

Member

@fchollet fchollet left a comment

Thanks for the PR! This is a great addition to applications.

Comments may not be exhaustive.

@sayakpaul sayakpaul marked this pull request as ready for review April 17, 2022 12:48
@sayakpaul
Contributor Author

Once the implementation looks good, I will add the other components. Please find the model conversion and the evaluation details in this comment: #16421 (comment).

By "conversion" I mean the following:

  • I first implemented the models in Keras.
  • I then populated them with the pre-trained parameters.

Implementation correctness needs to be ensured in these cases, hence the evaluation.
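
For reference, the acc@1 metric used in that evaluation can be sketched in plain Python as follows. This is only an illustration of the metric; the actual evaluation scripts live in the linked conversion repository and are not reproduced here, and `top1_accuracy` is a hypothetical helper name.

```python
from typing import Sequence


def top1_accuracy(logits: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of samples whose highest-scoring class equals the label."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        raise ValueError("need at least one sample")
    correct = 0
    for scores, label in zip(logits, labels):
        # Index of the largest score is the predicted class.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy data: four samples, three classes; the third sample is misclassified.
toy_logits = [[0.1, 2.0, -1.0], [3.0, 0.2, 0.1], [0.0, 0.5, 0.4], [1.0, 0.9, 2.5]]
toy_labels = [1, 0, 2, 2]
print(top1_accuracy(toy_logits, toy_labels))
```

If the Keras implementation or the parameter transfer were wrong, this number would be expected to fall well below the reported one, which is why the evaluation doubles as a correctness check.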

@LukeWood LukeWood self-requested a review April 17, 2022 20:12
@google-ml-butler google-ml-butler bot added the keras-team-review-pending Pending review by a Keras team member. label Apr 17, 2022
Comment on lines 52 to 55
"xlarge":
("da65d1294d386c71aebd81bc2520b8d42f7f60eee4414806c60730cd63eb15cb",
"2bfbf5f0c2b3f004f1c32e9a76661e11a9ac49014ed2a68a49ecd0cd6c88d377"),
}
Contributor Author

I have also converted the ImageNet-21k checkpoints, which would likely be better for transfer learning than the checkpoints from ImageNet-1k pre-training.

But to add those checkpoints we need the following:

  • ImageNet-21k models are supposed to be multi-label classifiers, so the activation should be "sigmoid". That is, when weights="imagenet21k" and include_top=True, classifier_activation should be "sigmoid".

This would require changes to imagenet_utils.validate_activation. As I understand it, it only supports softmax at the moment.
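
A rough sketch of what such a check could look like is below. It is a standalone illustration, not the existing imagenet_utils.validate_activation: the function name, the mapping, and the "imagenet21k" value are hypothetical, and allowing None for pretrained heads is an assumption.

```python
from typing import Optional

# Hypothetical mapping from a `weights` value to the activations it permits.
# "imagenet21k" is the value proposed in this comment, not an existing option.
# Assumption: None (no activation, i.e. raw logits) is also acceptable.
ALLOWED_ACTIVATIONS = {
    "imagenet": {None, "softmax"},
    "imagenet21k": {None, "sigmoid"},
}


def validate_activation_sketch(
    classifier_activation: Optional[str], weights: Optional[str]
) -> None:
    """Raise ValueError if the activation does not match the pretrained head."""
    if weights not in ALLOWED_ACTIVATIONS:
        # Random initialization or a custom weight file: nothing to enforce.
        return
    allowed = ALLOWED_ACTIVATIONS[weights]
    if classifier_activation not in allowed:
        raise ValueError(
            f"classifier_activation={classifier_activation!r} is not valid with "
            f"weights={weights!r}; expected one of {sorted(map(str, allowed))}"
        )


# Accepted: sigmoid head with the (hypothetical) 21k weights.
validate_activation_sketch("sigmoid", weights="imagenet21k")
```

Keying the allowed set on the weights value keeps the existing softmax behaviour for ImageNet-1k untouched while leaving room for the multi-label case.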

Contributor Author

IMO, a better idea would be to keep the PR as it is. Once it's done, we could open a second PR that sets up validation for sigmoid when loading pre-trained models like the ones mentioned above. After that, I'm happy to work on a third PR that incorporates the ImageNet-21k checkpoints with the necessary changes.

Member

Sounds good, it's fine to only have support for imagenet1k for now (this is consistent with the other applications). We can add more checkpoints in the future.

@gbaned gbaned requested a review from fchollet April 18, 2022 10:32

@sayakpaul
Contributor Author

sayakpaul commented Apr 19, 2022

@fchollet just incorporated the changes you asked for.

These changes will require a re-conversion of the pre-trained parameters since the model structure has been changed a bit now. I will do that (as well as the ImageNet-1k evaluation) after others have had a chance to review the PR.

@sayakpaul sayakpaul requested a review from fchollet April 20, 2022 04:22
Contributor

@LukeWood LukeWood left a comment

Looks good to me - just need to figure out the right end action per Francois' comments and add tests when that is done. Thanks @sayakpaul for the contribution!

@rchao rchao removed the keras-team-review-pending Pending review by a Keras team member. label Apr 21, 2022
Contributor

@LukeWood LukeWood left a comment

Performed a more thorough pass to get this moving along. A few minor changes, then a big question regarding normalization that I think @fchollet will have context on.

def apply(x):
  x = layers.Normalization(
      mean=[0.485 * 255, 0.456 * 255, 0.406 * 255],
Contributor

So, these are initialized based on imagenet: this is required for use with the pretrained weights. Is there a way we can allow users to configure this for custom datasets?

@fchollet

Contributor Author

It's inspired by ResNet-RS and RegNets. For non-ImageNet weights, the normalization needs to be disabled.
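
As a minimal sketch of the behaviour being discussed, the per-channel arithmetic can be written in plain Python as below. The mean values are the ones shown in the reviewed snippet; the standard deviations are the commonly used ImageNet values and are an assumption here, since the snippet is cut off before them. The flag name `include_preprocessing` and the function name are illustrative.

```python
from typing import List, Sequence

# Per-channel means from the reviewed snippet (RGB, scaled to the 0-255 range).
IMAGENET_MEAN = [0.485 * 255, 0.456 * 255, 0.406 * 255]
# Assumption: the commonly used ImageNet standard deviations, scaled the same way.
# A Normalization layer would take the variance, i.e. the square of each value.
IMAGENET_STD = [0.229 * 255, 0.224 * 255, 0.225 * 255]


def preprocess_pixel(
    pixel: Sequence[float], include_preprocessing: bool = True
) -> List[float]:
    """Normalize one RGB pixel, or pass it through when preprocessing is off."""
    if not include_preprocessing:
        # Custom datasets: the caller handles normalization on their own.
        return [float(v) for v in pixel]
    return [
        (float(v) - mean) / std
        for v, mean, std in zip(pixel, IMAGENET_MEAN, IMAGENET_STD)
    ]


# A pixel equal to the dataset mean maps to roughly zero in every channel.
print(preprocess_pixel(IMAGENET_MEAN))
# With the flag off, the pixel is returned unchanged.
print(preprocess_pixel([10, 20, 30], include_preprocessing=False))
```

A boolean switch like this lets the pretrained weights keep their expected input statistics while giving users with other datasets a way to opt out.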

@sayakpaul
Contributor Author

sayakpaul commented May 6, 2022

@LukeWood with the current setup, we have a problem.

https://github.com/sayakpaul/keras/blob/feat/convnext/keras/applications/applications_test.py#L129 will fail because the model cannot be instantiated from its config.

https://github.com/sayakpaul/keras/blob/feat/convnext/keras/applications/applications_test.py#L134-#L137 does not help.

Error trace (obtained by testing the component separately):

Traceback (most recent call last):
  File "convert.py", line 249, in <module>
    main(args)
  File "convert.py", line 111, in main
    reconstructed_model = convnext_model_tf.__class__.from_config(config)
  File "/Users/sayakpaul/.local/bin/.virtualenvs/pytorch/lib/python3.8/site-packages/keras/engine/functional.py", line 708, in from_config
    input_tensors, output_tensors, created_layers = reconstruct_from_config(
  File "/Users/sayakpaul/.local/bin/.virtualenvs/pytorch/lib/python3.8/site-packages/keras/engine/functional.py", line 1326, in reconstruct_from_config
    process_layer(layer_data)
  File "/Users/sayakpaul/.local/bin/.virtualenvs/pytorch/lib/python3.8/site-packages/keras/engine/functional.py", line 1308, in process_layer
    layer = deserialize_layer(layer_data, custom_objects=custom_objects)
  File "/Users/sayakpaul/.local/bin/.virtualenvs/pytorch/lib/python3.8/site-packages/keras/layers/serialization.py", line 207, in deserialize
    return generic_utils.deserialize_keras_object(
  File "/Users/sayakpaul/.local/bin/.virtualenvs/pytorch/lib/python3.8/site-packages/keras/utils/generic_utils.py", line 679, in deserialize_keras_object
    deserialized_obj = cls.from_config(
  File "/Users/sayakpaul/.local/bin/.virtualenvs/pytorch/lib/python3.8/site-packages/keras/engine/training.py", line 2641, in from_config
    functional.reconstruct_from_config(config, custom_objects))
  File "/Users/sayakpaul/.local/bin/.virtualenvs/pytorch/lib/python3.8/site-packages/keras/engine/functional.py", line 1325, in reconstruct_from_config
    for layer_data in config['layers']:
KeyError: 'layers'

I am currently trying to wrap LayerScale as a separate layer so that ConvNeXtBlock class could be turned into a nested function such as done in RegNets (example).

If this works out well, then the problem is solved; otherwise, we'll have to brainstorm more.
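
The nested-function pattern mentioned above can be sketched without any framework as follows. This only shows the shape of the pattern (an outer function that captures hyperparameters and returns an inner `apply`); the arithmetic is a stand-in for the real layers, and `ConvNeXtBlockSketch` is a hypothetical name.

```python
from typing import Callable, List


def ConvNeXtBlockSketch(scale: float) -> Callable[[List[float]], List[float]]:
    """Return a function that applies a scaled transform plus a residual connection."""

    def apply(x: List[float]) -> List[float]:
        residual = x
        # Stand-in for the block body (convolution, normalization, MLP, LayerScale).
        x = [v * scale for v in x]
        # Residual connection, as in the block being discussed.
        return [r + v for r, v in zip(residual, x)]

    return apply


block = ConvNeXtBlockSketch(scale=0.5)
print(block([2.0, 4.0]))  # [3.0, 6.0]
```

Because the block is a plain function rather than a nested model object, it has no config of its own to reconstruct; the layers it calls become part of the outer model directly.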

@sayakpaul
Contributor Author

@LukeWood I think I was able to make things work. I ran bazel test keras/applications/applications_test and it's passing successfully (Colab Notebook).

Here's what changed:

LayerScale has now become a layer so that we can stay with the Functional API. This simplifies the model design and stays in line with the other Keras applications.

One thing I couldn't understand is that the test fails without this "with" context. It should have also complained about the StochasticDepth layer since it's also custom. Both of these custom layers override get_config(), so I'm not sure whether I defined get_config() for LayerScale incorrectly.
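
To illustrate the general mechanism in play, here is a framework-free stand-in for a config round-trip. It is not Keras code and does not explain the StochasticDepth behaviour; it only mimics the idea that a custom class has to be made known to the deserializer (in Keras this is typically done through custom_objects or a custom object scope). All names below are hypothetical.

```python
from typing import Any, Dict, List, Optional


class LayerScaleSketch:
    """Scales each channel by a per-channel value (stored here as a plain list)."""

    def __init__(self, init_value: float, projection_dim: int):
        self.init_value = init_value
        self.projection_dim = projection_dim
        self.gamma = [init_value] * projection_dim

    def __call__(self, x: List[float]) -> List[float]:
        return [v * g for v, g in zip(x, self.gamma)]

    def get_config(self) -> Dict[str, Any]:
        # Everything needed to call __init__ again must be in the config.
        return {"init_value": self.init_value, "projection_dim": self.projection_dim}

    @classmethod
    def from_config(cls, config: Dict[str, Any]) -> "LayerScaleSketch":
        return cls(**config)


def deserialize(spec: Dict[str, Any], custom_objects: Optional[Dict[str, type]] = None):
    """Rebuild an object from {"class_name": ..., "config": ...}."""
    registry = dict(custom_objects or {})
    cls = registry.get(spec["class_name"])
    if cls is None:
        # Without the custom class in scope, the name cannot be resolved.
        raise ValueError(f"Unknown layer: {spec['class_name']}")
    return cls.from_config(spec["config"])


layer = LayerScaleSketch(init_value=1e-6, projection_dim=3)
spec = {"class_name": "LayerScaleSketch", "config": layer.get_config()}
restored = deserialize(spec, custom_objects={"LayerScaleSketch": LayerScaleSketch})
print(restored.get_config() == layer.get_config())
```

Calling `deserialize(spec)` without `custom_objects` raises, which is the stand-in analogue of a test failing when the custom layer is not in scope.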

Let me know your thoughts on the recent changes.

P.S.: Weight conversion and verification have been performed on ImageNet-1k as well and they are all good. Refer here.

@sayakpaul sayakpaul requested a review from LukeWood May 6, 2022 07:51
@gbaned gbaned requested review from fchollet and removed request for fchollet May 6, 2022 14:52
@LukeWood LukeWood added the ready to pull Ready to be merged into the codebase label May 9, 2022
@LukeWood LukeWood removed the ready to pull Ready to be merged into the codebase label May 9, 2022
@LukeWood
Contributor

LukeWood commented May 9, 2022

Hey @sayakpaul ! I uploaded the weights to our bucket. Now we can update the path in your code and merge the PR. Thanks!

@sayakpaul
Contributor Author

@LukeWood done. Thank you!

@sayakpaul sayakpaul requested a review from LukeWood May 10, 2022 01:16
@LukeWood LukeWood added the ready to pull Ready to be merged into the codebase label May 10, 2022
@copybara-service copybara-service bot merged commit 3ff21f8 into keras-team:master May 10, 2022
@sayakpaul sayakpaul deleted the feat/convnext branch May 11, 2022 00:27
@AdityaKane2001
Contributor

This is great to have in keras.applications!

@LukeWood
Contributor

> This is great to have in keras.applications!

super excited to have it in keras.applications!

Labels
keras-team-review-pending Pending review by a Keras team member. ready to pull Ready to be merged into the codebase size:L
Development

Successfully merging this pull request may close these issues.

Add ConvNeXt family of models to keras.applications
7 participants