
fix: Parameterized norm freezing #32631

Merged
merged 2 commits into huggingface:main from feature/no-norm-freezing-r18 on Aug 19, 2024

Conversation

AlanBlanchet
Contributor

For the R18 model, the authors don't freeze norms in the backbone.

What does this PR do?

Fixes #32604
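
To illustrate, here is a minimal usage sketch, assuming the flag keeps the name freeze_backbone_batch_norms added in this PR (see configuration_rt_detr.py for the exact signature):

from transformers import RTDetrConfig, RTDetrForObjectDetection

# Sketch: build an RT-DETR model whose backbone keeps regular, trainable batch norms,
# matching the original R18 setup, instead of replacing them with frozen ones.
config = RTDetrConfig(freeze_backbone_batch_norms=False)
model = RTDetrForObjectDetection(config)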

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). No.
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests? I ran the prebuilt ones locally.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@qubvel

Member

@qubvel left a comment


Thanks for the PR, looks good to me! Can you please run make modified_only_fixup to fix code style issues?

src/transformers/models/rt_detr/configuration_rt_detr.py (inline review thread; outdated, resolved)
@AlanBlanchet
Contributor Author

Sorry for the delay. Should be good

@qubvel
Member

qubvel commented Aug 13, 2024

Cool! I've tried fine-tuning, and it works fine even without excluding weight decay for the batch norms. The only concern is that while loading the model I get the warning below, probably because the num_batches_tracked data was lost from the original checkpoint. We can probably leave freeze_backbone_batch_norms=True by default for all models, and let the user specify this for fine-tuning.

from transformers import AutoModelForObjectDetection

# Load the R18 checkpoint with regular (trainable) batch norms in the backbone
model = AutoModelForObjectDetection.from_pretrained(
    "PekingU/rtdetr_r18vd",
    freeze_backbone_batch_norms=False,
)

Some weights of RTDetrForObjectDetection were not initialized from the model checkpoint at PekingU/rtdetr_r18vd and are newly initialized: ['model.backbone.model.embedder.embedder.0.normalization.num_batches_tracked', 'model.backbone.model.embedder.embedder.1.normalization.num_batches_tracked', 'model.backbone.model.embedder.embedder.2.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.0.layers.0.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.0.layers.0.layer.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.0.layers.0.shortcut.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.0.layers.1.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.0.layers.1.layer.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.1.layers.0.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.1.layers.0.layer.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.1.layers.0.shortcut.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.1.layers.1.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.1.layers.1.layer.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.2.layers.0.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.2.layers.0.layer.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.2.layers.0.shortcut.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.2.layers.1.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.2.layers.1.layer.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.3.layers.0.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.3.layers.0.layer.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.3.layers.0.shortcut.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.3.layers.1.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.3.layers.1.layer.1.normalization.num_batches_tracked']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
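
For context: the missing keys are the num_batches_tracked buffers that nn.BatchNorm2d registers but a DETR-style frozen batch norm does not, so a checkpoint converted from the frozen variant has nothing to load into them. A minimal sketch of that pattern, assumed to mirror what the RT-DETR backbone does when freeze_backbone_batch_norms=True:

import torch
from torch import nn

# Simplified, assumed sketch of the DETR-style frozen batch norm:
# statistics and affine parameters are plain buffers, and there is no
# num_batches_tracked buffer, which is why converted checkpoints lack that key.
class FrozenBatchNorm2d(nn.Module):
    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.register_buffer("weight", torch.ones(num_features))
        self.register_buffer("bias", torch.zeros(num_features))
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_var", torch.ones(num_features))

    def forward(self, x):
        # Fixed affine transform using frozen statistics; nothing updates during training.
        scale = self.weight * (self.running_var + self.eps).rsqrt()
        shift = self.bias - self.running_mean * scale
        return x * scale[None, :, None, None] + shift[None, :, None, None]

With freeze_backbone_batch_norms=False the backbone uses regular nn.BatchNorm2d layers instead, which do register num_batches_tracked, so those buffers are reported as newly initialized; they only count update steps, so re-initializing them should be harmless for fine-tuning.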

@qubvel
Member

qubvel commented Aug 19, 2024

@amyeroberts what do you think? I suggest not changing configs on the hub for backward compatibility and to avoid weight conversion, just let the user use freeze_backbone_batch_norms=False in case it's needed.

@amyeroberts
Collaborator

amyeroberts commented Aug 19, 2024

@amyeroberts what do you think? I suggest not changing configs on the hub for backward compatibility and to avoid weight conversion, just let the user use freeze_backbone_batch_norms=False in case it's needed.

@qubvel Agreed!

Collaborator

@amyeroberts left a comment

Thanks for adding this fix!

@amyeroberts
Collaborator

@AlanBlanchet One thing that would be useful for users is to have this documented in the model's hub README. If you could open a PR to add that and share the link, we'd be happy to do a quick review so people know this feature exists!

@AlanBlanchet
Contributor Author

Hello @amyeroberts.
Sure! I'll open a PR on the model hub.

@qubvel merged commit 5f6c080 into huggingface:main on Aug 19, 2024
18 checks passed
@AlanBlanchet
Contributor Author

@amyeroberts Here is the link.
I'll add it to the other versions after this gets approved.

@AlanBlanchet deleted the feature/no-norm-freezing-r18 branch on August 20, 2024 12:33