
Is this a typo? #8

Open
johnhuichen opened this issue Jun 29, 2024 · 0 comments

Comments

johnhuichen commented Jun 29, 2024

When explaining Activation:

Activation: The RoBERTa uses a GELU activation function. We can implement the GELU using a similar approach as dropout above with no input params. Candle tensors have an inbuilt module to perform this operation

After that it continues to say:

Candle: In candle we can implement the dropout layer by just returning the input tensor

use candle_core::{Result, Tensor};

// Stateless layer: forward applies GELU elementwise via candle's built-in Tensor::gelu.
struct Activation {}

impl Activation {
    fn new() -> Self {
        Self {}
    }

    fn forward(&self, x: &Tensor) -> Result<Tensor> {
        Ok(x.gelu()?)
    }
}

This looks like a typo copied over from the previous section. It should probably say that in Candle we implement the activation layer by calling the gelu function.
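For reference, a minimal usage sketch (mine, not from the book), assuming the Activation struct above is in scope and candle_core is available: build a small tensor on the CPU and pass it through the layer.

use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    let device = Device::Cpu;

    // Hypothetical example input; any shape works since GELU is applied elementwise.
    let x = Tensor::new(&[-1.0f32, 0.0, 1.0], &device)?;

    let activation = Activation::new();
    let y = activation.forward(&x)?;

    println!("{y}");
    Ok(())
}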
