When explaining Activation:
> Activation: The RoBERTa uses a GELU activation function. We can implement the GELU using a similar approach as dropout above with no input params. Candle tensors have an inbuilt module to perform this operation
After that it continues to say:
> Candle: In candle we can implement the dropout layer by just returning the input tensor
This looks like a copy-paste typo carried over from the previous section. It should probably say that in Candle we implement the activation layer by calling the `gelu` function on the tensor.
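For illustration, here is a minimal sketch of what the corrected sentence would describe, assuming `candle_core`'s `Tensor::gelu` method; the `Dropout` and `GeluActivation` struct names are hypothetical and not taken from the tutorial:

```rust
use candle_core::{Device, Result, Tensor};

// Hypothetical name: inference-time dropout is a no-op, as the tutorial says.
struct Dropout;

impl Dropout {
    fn forward(&self, xs: &Tensor) -> Result<Tensor> {
        // Simply return the input tensor unchanged.
        Ok(xs.clone())
    }
}

// Hypothetical name: the activation "layer" has no parameters and just
// calls the tensor's built-in gelu op.
struct GeluActivation;

impl GeluActivation {
    fn forward(&self, xs: &Tensor) -> Result<Tensor> {
        xs.gelu()
    }
}

fn main() -> Result<()> {
    let xs = Tensor::new(&[-1.0f32, 0.0, 1.0], &Device::Cpu)?;
    let ys = GeluActivation.forward(&Dropout.forward(&xs)?)?;
    println!("{ys}");
    Ok(())
}
```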