
fix: Parameterized norm freezing #32631

Merged
merged 2 commits into huggingface:main from feature/no-norm-freezing-r18 on Aug 19, 2024

Conversation

AlanBlanchet
Contributor

For the R18 model, the authors don't freeze norms in the backbone.

What does this PR do?

Fixes #32604
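
To illustrate, here is a minimal usage sketch, assuming the flag keeps the name freeze_backbone_batch_norms added in this PR (see configuration_rt_detr.py for the exact signature):

from transformers import RTDetrConfig, RTDetrForObjectDetection

# Sketch: build an RT-DETR model whose backbone keeps regular, trainable batch norms,
# matching the original R18 setup, instead of replacing them with frozen ones.
config = RTDetrConfig(freeze_backbone_batch_norms=False)
model = RTDetrForObjectDetection(config)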

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). No.
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests? I ran the prebuilt ones locally.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@qubvel

Member

@qubvel left a comment


Thanks for the PR, looks good to me! Can you please run make modified_only_fixup to fix code style issues?

src/transformers/models/rt_detr/configuration_rt_detr.py (inline review thread; outdated, resolved)
@AlanBlanchet
Contributor Author

Sorry for the delay. Should be good

@qubvel
Member

qubvel commented Aug 13, 2024

Cool! I've tried fine-tuning, and it works fine even without excluding weight decay for the batch norms. The only concern is that while loading the model I get the warning below, probably because the num_batches_tracked data was lost from the original checkpoint. We can probably leave freeze_backbone_batch_norms=True by default for all models, and let the user specify this for fine-tuning.

from transformers import AutoModelForObjectDetection

# Load the R18 checkpoint with regular (trainable) batch norms in the backbone
model = AutoModelForObjectDetection.from_pretrained(
    "PekingU/rtdetr_r18vd",
    freeze_backbone_batch_norms=False,
)

Some weights of RTDetrForObjectDetection were not initialized from the model checkpoint at PekingU/rtdetr_r18vd and are newly initialized: ['model.backbone.model.embedder.embedder.0.normalization.num_batches_tracked', 'model.backbone.model.embedder.embedder.1.normalization.num_batches_tracked', 'model.backbone.model.embedder.embedder.2.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.0.layers.0.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.0.layers.0.layer.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.0.layers.0.shortcut.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.0.layers.1.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.0.layers.1.layer.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.1.layers.0.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.1.layers.0.layer.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.1.layers.0.shortcut.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.1.layers.1.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.1.layers.1.layer.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.2.layers.0.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.2.layers.0.layer.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.2.layers.0.shortcut.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.2.layers.1.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.2.layers.1.layer.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.3.layers.0.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.3.layers.0.layer.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.3.layers.0.shortcut.1.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.3.layers.1.layer.0.normalization.num_batches_tracked', 'model.backbone.model.encoder.stages.3.layers.1.layer.1.normalization.num_batches_tracked']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
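
For context: the missing keys are the num_batches_tracked buffers that nn.BatchNorm2d registers but a DETR-style frozen batch norm does not, so a checkpoint converted from the frozen variant has nothing to load into them. A minimal sketch of that pattern, assumed to mirror what the RT-DETR backbone does when freeze_backbone_batch_norms=True:

import torch
from torch import nn

# Simplified, assumed sketch of the DETR-style frozen batch norm:
# statistics and affine parameters are plain buffers, and there is no
# num_batches_tracked buffer, which is why converted checkpoints lack that key.
class FrozenBatchNorm2d(nn.Module):
    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.register_buffer("weight", torch.ones(num_features))
        self.register_buffer("bias", torch.zeros(num_features))
        self.register_buffer("running_mean", torch.zeros(num_features))
        self.register_buffer("running_var", torch.ones(num_features))

    def forward(self, x):
        # Fixed affine transform using frozen statistics; nothing updates during training.
        scale = self.weight * (self.running_var + self.eps).rsqrt()
        shift = self.bias - self.running_mean * scale
        return x * scale[None, :, None, None] + shift[None, :, None, None]

With freeze_backbone_batch_norms=False the backbone uses regular nn.BatchNorm2d layers instead, which do register num_batches_tracked, so those buffers are reported as newly initialized; they only count update steps, so re-initializing them should be harmless for fine-tuning.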

@qubvel
Member

qubvel commented Aug 19, 2024

@amyeroberts what do you think? I suggest not changing configs on the hub for backward compatibility and to avoid weight conversion, just let the user use freeze_backbone_batch_norms=False in case it's needed.

@amyeroberts
Collaborator

amyeroberts commented Aug 19, 2024

@amyeroberts what do you think? I suggest not changing configs on the hub for backward compatibility and to avoid weight conversion, just let the user use freeze_backbone_batch_norms=False in case it's needed.

@qubvel Agreed!

Collaborator

@amyeroberts left a comment

Thanks for adding this fix!

@amyeroberts
Collaborator

@AlanBlanchet One thing that would be useful for users is to have this documented in the model's hub README. If you could open a PR to add that and share the link, we'd be happy to do a quick review so people know this feature exists!

@AlanBlanchet
Contributor Author

Hello @amyeroberts.
Sure! I'll open a PR on the model hub.

@qubvel merged commit 5f6c080 into huggingface:main on Aug 19, 2024
18 checks passed
@AlanBlanchet
Contributor Author

@amyeroberts Here is the link.
I'll add it to the other versions after this gets approved.

@AlanBlanchet deleted the feature/no-norm-freezing-r18 branch on August 20, 2024 12:33