
add lrp for gpt2 #9

Open · wants to merge 1 commit into main

Conversation

Tomsawyerhu

I wrote an LRP-backward version for GPT-2 and provide unit tests. In short, two functions (conv1d and baddbmm) and one module (Conv1d) are newly added. I also modified the source code of GPT-2; for convenience, the dropout layers are omitted.
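To illustrate the approach described above, here is a minimal sketch of what an epsilon-LRP backward for GPT-2's Conv1D operation (which computes y = x @ W + b) could look like when written as a custom torch.autograd.Function, so that the backward pass carries relevance instead of gradients. This is an illustrative reconstruction, not the code from this pull request; the class name and the stabilizer constant are assumptions.

```python
import torch
from torch.autograd import Function

class Conv1DEpsilonSketch(Function):
    """Illustrative epsilon-LRP backward for GPT-2's Conv1D (y = x @ W + b).

    The backward pass does not return the true gradient; it redistributes the
    incoming relevance R_y onto the input via the epsilon rule:
        R_x = x * ((R_y / (y + eps * sign(y))) @ W^T)
    """

    @staticmethod
    def forward(ctx, x, weight, bias, eps=1e-6):
        out = x @ weight + bias              # weight: (in_features, out_features)
        ctx.save_for_backward(x, weight, out)
        ctx.eps = eps
        return out

    @staticmethod
    def backward(ctx, relevance_out):
        x, weight, out = ctx.saved_tensors
        # epsilon-stabilized denominator; treat sign(0) as +1 to avoid division by zero
        sign = out.sign()
        sign[sign == 0] = 1.0
        rel_norm = relevance_out / (out + ctx.eps * sign)
        # redistribute relevance to the input in proportion to its contribution
        relevance_x = x * (rel_norm @ weight.transpose(-1, -2))
        # no relevance flows to the weight, bias, or eps arguments
        return relevance_x, None, None, None
```

In a modified GPT-2 forward pass, Conv1DEpsilonSketch.apply(x, weight, bias) would stand in for the original Conv1D computation, and calling backward() on the selected output relevance then delivers input relevances in x.grad. The same pattern extends to baddbmm in the attention blocks.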

@rachtibat (Owner)

Thank you for your nice work. It will take some time for me to digest your code.
We did some experiments with GPT-2 (not published) and noticed that it benefits from explaining the softmax classification output with temperature scaling. That is not so important right now; I just wanted to keep it in mind.
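As an aside, a small sketch of what "explaining the softmax classification output with temperature scaling" could look like in practice; the temperature value and the target-selection step here are assumptions for illustration, not details from the unpublished experiments.

```python
import torch

# toy next-token logits standing in for the model output
logits = torch.randn(1, 50257, requires_grad=True)

temperature = 2.0                                    # assumed value, for illustration only
probs = torch.softmax(logits / temperature, dim=-1)  # temperature-scaled classification output

# start the relevance backward pass from the scaled probability of the predicted token
target = probs.argmax()
probs[0, target].backward()
relevance = logits.grad
```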

@rachtibat (Owner)

Update: Sorry for the delayed review! Your code looks good! I only think we could simplify it a lot by using lxt.rules instead of implementing all functions from scratch. For example, the Conv1DEpsilon(GPTConv1D) class is not really necessary, because we can apply lxt.rules.EpsilonRule directly to the GPTConv1D layer, since it is a linear operation, just as we do for the nn.Linear layers in LlaMA:

nn.Linear: rules.EpsilonRule,

Unfortunately, I need more time to adapt these changes (I am also writing two follow-up papers right now), but I wanted to give you some "fast" feedback and to thank you again for the really nice pull request!
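For concreteness, here is a minimal sketch of the simplification suggested above: registering rules.EpsilonRule directly on GPT-2's Conv1D layer through a Composite, analogous to the nn.Linear mapping quoted above for LlaMA. The import paths (lxt.core.Composite, transformers.pytorch_utils.Conv1D) and the exact Composite/register usage are assumptions based on how lxt handles LlaMA and may differ between library versions.

```python
import torch.nn as nn
from transformers import GPT2LMHeadModel
from transformers.pytorch_utils import Conv1D  # assumed location of GPT-2's Conv1D layer
import lxt.rules as rules
from lxt.core import Composite                 # assumed import path for lxt's Composite

model = GPT2LMHeadModel.from_pretrained("gpt2")

# Conv1D computes y = x @ W + b, i.e. it is a linear operation,
# so the epsilon rule can wrap it just like nn.Linear.
attnlrp = Composite({
    Conv1D: rules.EpsilonRule,
    nn.Linear: rules.EpsilonRule,
})
attnlrp.register(model)
```

Non-linear operations inside the attention block (softmax, the attention matmul) would still need the corresponding lxt rules or functional replacements; this sketch only covers the linear Conv1D layers.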
