GELU does not appear to support approximate tanh #1368
Comments
I'm guessing this could be exposed as an option, and then either the current GELU factory function could be replaced or an overload added (either way seems similar).
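To make the idea concrete, a rough sketch of the factory change (the class wrapper, names, and the string-taking constructor are guesses on my part, not current TorchSharp API):

```csharp
using TorchSharp;

// Schematic stand-in for the torch.nn factory class.
public static class nn_sketch
{
    // Guessed shape: an optional 'approximate' parameter mirroring
    // torch.nn.GELU(approximate=...); an overload next to the existing
    // parameterless factory would look much the same.
    public static Modules.GELU GELU(string approximate = "none")
        => new Modules.GELU(approximate);
}
```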
Two options: pass approximate through as a plain string, exactly as PyTorch does, or expose it as a strongly-typed enum on the managed side.
Sorry if this seems obvious, just trying to make sure it's right. I'm definitely willing to try the PR approach for this (and anything else I could help with).
Would the enum then reside within the same GELU.cs file? Perhaps the changes could look something like the following. The PInvoke change (the entry-point name and signature are my guesses):
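```csharp
using System;
using System.Runtime.InteropServices;

internal static partial class NativeMethods   // placement is a guess
{
    // Assumed entry point; note that P/Invoke marshals the enum
    // parameter as its underlying integer value.
    [DllImport("LibTorchSharp")]
    internal static extern IntPtr THSNN_GELU_ctor(ApproxType approximate, out IntPtr pBoxedModule);
}
```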
The GELU.cs change (within the Modules namespace):
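```csharp
namespace TorchSharp.Modules
{
    // Rough sketch: the enum lives in GELU.cs next to the module;
    // member names mirror PyTorch's "none" / "tanh" strings.
    public enum ApproxType
    {
        none,
        tanh
    }
}
```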
And the updated constructor (this is the version that, as noted in the next comment, throws):
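```csharp
// Reconstruction of my attempt ('handle'/'boxedModule' come from the
// surrounding module class): the enum is passed straight to the native
// call, so it is marshaled as an integer rather than a string.
internal GELU(ApproxType approximate = ApproxType.none) : base(nameof(GELU))
{
    handle = THSNN_GELU_ctor(approximate, out boxedModule);
}
```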
I tried the previous code, but it throws an exception when calling the ctor. If I use a string instead of the enum it works, so perhaps the implicit conversion of ApproxType.tanh to 1 is causing the problem. I'm unsure how or where the enum should be converted back to a string to satisfy the native call. Perhaps a blend of the two?
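Maybe something like this (the string-taking entry point is still a guess on my part):

```csharp
using System;
using System.Runtime.InteropServices;

// Native import takes a string; the managed surface keeps the enum and
// the ctor maps it back (member names were chosen to match PyTorch's
// strings, so ToString() is enough).
[DllImport("LibTorchSharp")]
internal static extern IntPtr THSNN_GELU_ctor(
    [MarshalAs(UnmanagedType.LPStr)] string approximate, out IntPtr pBoxedModule);

internal GELU(ApproxType approximate = ApproxType.none) : base(nameof(GELU))
{
    handle = THSNN_GELU_ctor(approximate.ToString(), out boxedModule);
}
```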
GELU has an optional algorithm that internally uses a tanh approximation.
See more here:
https://pytorch.org/docs/stable/generated/torch.nn.GELU.html#torch.nn.GELU
I was expecting this to just work:
```csharp
var gelu = nn.GELU(approximate: "tanh");
```
When the approximate argument is 'tanh', GELU is estimated as 0.5 * x * (1 + tanh(sqrt(2/π) * (x + 0.044715 * x³))), whereas the default, 'none', computes the exact form x * Φ(x), where Φ is the Gaussian CDF.
Since this is supported natively, is it possible to include the approximate option in TorchSharp's GELU?
Is there a way for me to do this myself, without the difficulty of pushing a new version of the library?
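In the meantime, the tanh variant can be built from primitive tensor ops without touching the library. A rough sketch (my own helper, not TorchSharp API), following the formula from the PyTorch docs:

```csharp
using System;
using static TorchSharp.torch;

static class GeluWorkaround
{
    // Tanh-approximated GELU:
    // 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    public static Tensor GeluTanh(Tensor x)
    {
        var c = Math.Sqrt(2.0 / Math.PI);
        var inner = (x + x.pow(3) * 0.044715) * c;
        return x * 0.5 * (inner.tanh() + 1.0);
    }
}
```

Calling GeluWorkaround.GeluTanh(x) in the forward pass would then stand in for nn.GELU until the option lands upstream.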