
Gemma: update activation warning #29995

Merged 2 commits into huggingface:main on May 1, 2024

Conversation

@pcuenca (Member) commented Apr 2, 2024

What does this PR do?

This is a nit PR, but I was confused. I got the warning even after I had changed hidden_act to gelu_pytorch_tanh, telling me that I was using the "legacy" gelu_pytorch_tanh.

Another option is to keep the warning but change the message to say something like "hidden_act is ignored, please use hidden_activation instead. Setting Gemma's activation function to gelu_pytorch_tanh".

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker (Collaborator) left a comment


Thanks, the goal is to have everyone set `hidden_activation` and not `hidden_act`.

@pcuenca (Member, Author) commented Apr 3, 2024

Yeah, I get it, but the message I got was:

    Gemma's activation function should be approximate GeLU and not exact GeLU.
    Changing the activation function to gelu_pytorch_tanh.if you want to use the legacy gelu_pytorch_tanh, edit the model.config to set hidden_activation=gelu_pytorch_tanh instead of hidden_act. See #29402 for more details.
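The fused "…gelu_pytorch_tanh.if you want…" in that message is a side effect of Python's implicit concatenation of adjacent string literals: the second literal ends with no trailing space or newline before the third begins. A minimal, self-contained sketch (strings copied from the warning, no transformers import needed):

```python
# Minimal sketch of how Python's implicit concatenation of adjacent string
# literals produces the fused "...gelu_pytorch_tanh`.if you want..." seen in
# the quoted warning above.
hidden_act = "gelu_pytorch_tanh"
message = (
    "Gemma's activation function should be approximate GeLU and not exact GeLU.\n"
    "Changing the activation function to `gelu_pytorch_tanh`."  # no trailing space or \n
    f"if you want to use the legacy `{hidden_act}`, "
    f"edit the `model.config` to set `hidden_activation={hidden_act}` "
    "  instead of `hidden_act`."
)
# The period and "if" run together because nothing separates the two literals.
assert "`gelu_pytorch_tanh`.if you want" in message
```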

@ArthurZucker (Collaborator) commented Apr 3, 2024

    edit the model.config to set hidden_activation=gelu_pytorch_tanh instead of hidden_act.

Sounds alright to me no? Feel free to make it clearer!

@pcuenca (Member, Author) commented Apr 3, 2024

But

    if you want to use the legacy gelu_pytorch_tanh, edit the model.config to set hidden_activation=gelu_pytorch_tanh

I'm not using a legacy function, and the warning says it's changing it when it isn't. So my suggestion was to either hide the warning (only in this case, when the activation matches what we want), or if we want to encourage the use of hidden_activation, just say that.

@ArthurZucker (Collaborator) commented Apr 3, 2024

You are right, let's encourage the use of `hidden_activation`. Note that it is changing it:

        if config.hidden_activation is None:
            logger.warning_once(
                "Gemma's activation function should be approximate GeLU and not exact GeLU.\n"
                "Changing the activation function to `gelu_pytorch_tanh`."
                f"if you want to use the legacy `{config.hidden_act}`, "
                f"edit the `model.config` to set `hidden_activation={config.hidden_act}` "
                "  instead of `hidden_act`. See https://github.com/huggingface/transformers/pull/29402 for more details."
            )
            hidden_activation = "gelu_pytorch_tanh"
        else:
            hidden_activation = config.hidden_activation
        self.act_fn = ACT2FN[hidden_activation]

The local `hidden_activation = "gelu_pytorch_tanh"` is used, but it's not changing `config.hidden_activation`, which is also something we can do!
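The distinction here, that only a local name is set while the config field stays `None`, can be seen with a toy stand-in (hypothetical `ToyConfig`, not the real `GemmaConfig`):

```python
# Toy illustration (hypothetical ToyConfig, not the real GemmaConfig) of the
# pre-PR behaviour: a local variable is set, but config.hidden_activation
# itself is never updated, so it stays None.
class ToyConfig:
    def __init__(self):
        self.hidden_act = "gelu_pytorch_tanh"   # what the user set
        self.hidden_activation = None           # new field, never set

config = ToyConfig()

if config.hidden_activation is None:
    # pre-PR: only the local name changes
    hidden_activation = "gelu_pytorch_tanh"
else:
    hidden_activation = config.hidden_activation

assert hidden_activation == "gelu_pytorch_tanh"
assert config.hidden_activation is None  # untouched, so the warning fires every time
```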

@pcuenca (Member, Author) commented Apr 3, 2024

Yes, `hidden_activation = "gelu_pytorch_tanh"`; I meant that it's no different from what the user expressed with `config.hidden_act = "gelu_pytorch_tanh"` :) I'll submit a wording change. I didn't think about changing the config too; we can do that if you think it helps :)

@pcuenca pcuenca changed the title Gemma: only display act. warning when necessary Gemma: update activation warning Apr 3, 2024
@ArthurZucker (Collaborator) previously approved these changes Apr 3, 2024:

Thanks

@ArthurZucker ArthurZucker dismissed their stale review April 3, 2024 09:39

Actually wait

Comment on lines -177 to +183

    -                "Gemma's activation function should be approximate GeLU and not exact GeLU.\n"
    -                "Changing the activation function to `gelu_pytorch_tanh`."
    -                f"if you want to use the legacy `{config.hidden_act}`, "
    -                f"edit the `model.config` to set `hidden_activation={config.hidden_act}` "
    -                "  instead of `hidden_act`. See https://github.com/huggingface/transformers/pull/29402 for more details."
    +                "`config.hidden_act` is ignored, you should use `config.hidden_activation` instead.\n"
    +                "Gemma's activation function will be set to `gelu_pytorch_tanh`. Please, use\n"
    +                "`config.hidden_activation` if you want to override this behaviour.\n"
    +                "See https://github.com/huggingface/transformers/pull/29402 for more details."
                 )
    -            hidden_activation = "gelu_pytorch_tanh"
    -        else:
    -            hidden_activation = config.hidden_activation
    +            config.hidden_activation = "gelu_pytorch_tanh"
    +        hidden_activation = config.hidden_activation
@ArthurZucker (Collaborator) commented:

If `config.hidden_activation` is set to None, it has to be set to `gelu_pytorch_tanh`. That is what the warning is saying. We force it.
If the user wants to use another activation, they ought to use `config.hidden_activation`, not `config.act_fn`. I agree with potentially doing something like `config.hidden_activation = "gelu_pytorch_tanh"`, but that should be done in the if, not in the else.

@pcuenca (Member, Author) commented:

I'm not sure I follow :) This is what the block currently looks like without the diff markup (there's no else any more):

        if config.hidden_activation is None:
            logger.warning_once(
                "`config.hidden_act` is ignored, you should use `config.hidden_activation` instead.\n"
                "Gemma's activation function will be set to `gelu_pytorch_tanh`. Please, use\n"
                "`config.hidden_activation` if you want to override this behaviour.\n"
                "See https://github.com/huggingface/transformers/pull/29402 for more details."
            )
            config.hidden_activation = "gelu_pytorch_tanh"
        hidden_activation = config.hidden_activation

Is that not what we want?

@ArthurZucker (Collaborator) commented:

yes! diff looked wrong!
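The behaviour agreed on above, warning and forcing the activation only when `config.hidden_activation` is unset and then persisting it on the config so a second pass stays silent, can be sketched self-contained (toy config and a plain list standing in for `logger.warning_once`; not the real `GemmaConfig`):

```python
# Self-contained sketch of the merged logic: warn and force the activation
# only when config.hidden_activation is unset, then persist it on the config
# so subsequent calls see the value and stay silent.
warnings_emitted = []

class ToyConfig:
    hidden_activation = None  # hypothetical stand-in for the real GemmaConfig

def build_act(config):
    if config.hidden_activation is None:
        warnings_emitted.append(
            "`config.hidden_act` is ignored, you should use "
            "`config.hidden_activation` instead."
        )
        config.hidden_activation = "gelu_pytorch_tanh"
    return config.hidden_activation

config = ToyConfig()
assert build_act(config) == "gelu_pytorch_tanh"  # warns and forces the value
assert build_act(config) == "gelu_pytorch_tanh"  # config now set: no new warning
assert len(warnings_emitted) == 1
```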

@pcuenca (Member, Author) commented May 1, 2024

Forgot about this, merging now.

@pcuenca pcuenca merged commit f4f18af into huggingface:main May 1, 2024
18 checks passed
@pcuenca pcuenca deleted the gemma-act-warning branch May 1, 2024 15:23
itazap pushed a commit that referenced this pull request May 14, 2024
* Gemma: only display act. warning when necessary

This is a nit PR, but I was confused. I got the warning even after I
had changed `hidden_act` to `gelu_pytorch_tanh`, telling me that I
was using the "legacy" `gelu_pytorch_tanh`.

Another option is to keep the warning but change the message to say
something like "`hidden_act` is ignored, please use `hidden_activation`
instead. Setting Gemma's activation function to `gelu_pytorch_tanh`".

* Change message, and set `config.hidden_activation`