Finetuning doesn't initialize microsoft/resnet classifier weights with _fast_init #31841
Comments
Hi @williford Could you share your system info with us? You can run the command |
For the reproduction I installed
|
@ydshieh If I'm understanding the code correctly, your change makes sure the When _fast_init is not set, the Linear module initializes its weights via its "reset_parameters" method. |
@williford Thank you for diving into this issue. Yes, you are correct! I opened a PR to fix it and it works now. |
System Info
It seems that the changes in #11471 broke fine-tuning of ResNet when the number of classes is changed.
It seems like most models handle this by adding Linear to the following:
transformers/src/transformers/models/resnet/modeling_resnet.py
Line 274 in ae9dd02
However, it seems like it would be better to handle it when the mismatch size is detected in modeling_utils.py:
transformers/src/transformers/modeling_utils.py
Line 4282 in ae9dd02
Who can help?
@amyeroberts
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
E.g.
Disabling the _fast_init fixes the issue:
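The reporter's original snippets did not survive on this page. As a minimal sketch of the workaround (not the reporter's exact code), the helper below loads a ResNet checkpoint with a resized classification head and _fast_init disabled. The checkpoint name, the label count, and the helper name are assumptions made for illustration, and _fast_init is a private from_pretrained argument that may not be accepted by every transformers release.

```python
# Hypothetical helper; the checkpoint name and label count are illustrative assumptions.
CHECKPOINT = "microsoft/resnet-18"
NUM_LABELS = 10


def load_for_finetuning(fast_init: bool = False):
    """Load a ResNet classifier whose head is resized to NUM_LABELS classes."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import ResNetForImageClassification

    return ResNetForImageClassification.from_pretrained(
        CHECKPOINT,
        num_labels=NUM_LABELS,         # differs from the checkpoint's class count
        ignore_mismatched_sizes=True,  # skip the checkpoint's mismatched head
        _fast_init=fast_init,          # False is the workaround described above
    )
```

Calling load_for_finetuning() downloads the checkpoint; the freshly initialized head should then be the last module of model.classifier.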
Expected behavior
The statistics of the initialized weights should be similar with and without _fast_init. Importantly, the weights shouldn't contain NaNs, and their maximum absolute value shouldn't be 0 or extremely large (e.g. > 1e20).
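The expectation above can be turned into a small check. The following is a sketch only: the function name and the default limit are assumptions taken from the wording of this issue, and it works on any iterable of floats so it does not depend on a particular tensor library.

```python
import math
from typing import Iterable


def weights_look_sane(values: Iterable[float], max_abs_limit: float = 1e20) -> bool:
    """Return True if there is no NaN and 0 < max(|v|) <= max_abs_limit."""
    max_abs = 0.0
    seen_any = False
    for v in values:
        v = float(v)
        if math.isnan(v):
            return False  # NaN means the weights were never properly initialized
        max_abs = max(max_abs, abs(v))
        seen_any = True
    # Reject empty input, all-zero weights, and huge (or infinite) magnitudes.
    return seen_any and 0.0 < max_abs <= max_abs_limit
```

With a PyTorch model, the values could be passed as, for example, model.classifier[-1].weight.flatten().tolist(), assuming the new Linear layer is the last module of the classifier.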