
add lrp for gpt2 #9

Open · wants to merge 1 commit into main

Conversation

Tomsawyerhu

I wrote an LRP-backward version for GPT-2 and provide unit tests. In short, two functions (conv1d and baddbmm) and one module (Conv1d) are newly added. I also modified the source code of GPT-2; for convenience, the dropout layers are omitted.
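To illustrate the approach described above, here is a minimal sketch of what an epsilon-LRP backward for GPT-2's Conv1D operation (which computes y = x @ W + b) could look like when written as a custom torch.autograd.Function, so that the backward pass carries relevance instead of gradients. This is an illustrative reconstruction, not the code from this pull request; the class name and the stabilizer constant are assumptions.

```python
import torch
from torch.autograd import Function

class Conv1DEpsilonSketch(Function):
    """Illustrative epsilon-LRP backward for GPT-2's Conv1D (y = x @ W + b).

    The backward pass does not return the true gradient; it redistributes the
    incoming relevance R_y onto the input via the epsilon rule:
        R_x = x * ((R_y / (y + eps * sign(y))) @ W^T)
    """

    @staticmethod
    def forward(ctx, x, weight, bias, eps=1e-6):
        out = x @ weight + bias              # weight: (in_features, out_features)
        ctx.save_for_backward(x, weight, out)
        ctx.eps = eps
        return out

    @staticmethod
    def backward(ctx, relevance_out):
        x, weight, out = ctx.saved_tensors
        # epsilon-stabilized denominator; treat sign(0) as +1 to avoid division by zero
        sign = out.sign()
        sign[sign == 0] = 1.0
        rel_norm = relevance_out / (out + ctx.eps * sign)
        # redistribute relevance to the input in proportion to its contribution
        relevance_x = x * (rel_norm @ weight.transpose(-1, -2))
        # no relevance flows to the weight, bias, or eps arguments
        return relevance_x, None, None, None
```

In a modified GPT-2 forward pass, Conv1DEpsilonSketch.apply(x, weight, bias) would stand in for the original Conv1D computation, and calling backward() on the selected output relevance then delivers input relevances in x.grad. The same pattern extends to baddbmm in the attention blocks.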

@rachtibat (Owner)

Thank you for your nice work. It will take some time for me to digest your code.
We did some experiments with GPT-2 (not published) and noticed that it benefits from explaining the softmax classification output with temperature scaling. That is not so important right now; I just wanted to keep it in mind.
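As an aside, a small sketch of what "explaining the softmax classification output with temperature scaling" could look like in practice; the temperature value and the target-selection step here are assumptions for illustration, not details from the unpublished experiments.

```python
import torch

# toy next-token logits standing in for the model output
logits = torch.randn(1, 50257, requires_grad=True)

temperature = 2.0                                    # assumed value, for illustration only
probs = torch.softmax(logits / temperature, dim=-1)  # temperature-scaled classification output

# start the relevance backward pass from the scaled probability of the predicted token
target = probs.argmax()
probs[0, target].backward()
relevance = logits.grad
```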

@rachtibat (Owner)

Update: Sorry for the delayed review! Your code looks good! I only think we could simplify it a lot by using lxt.rules instead of implementing all functions from scratch. For example, the Conv1DEpsilon(GPTConv1D) class is not really necessary, because we can apply lxt.rules.EpsilonRule directly to the GPTConv1D layer, since it is a linear operation, just as we do for the nn.Linear layers in LlaMA:

nn.Linear: rules.EpsilonRule,

Unfortunately, I need more time to adapt these changes (I am also writing two follow-up papers right now), but I wanted to give you some "fast" feedback and to thank you again for the really nice pull request!
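For concreteness, here is a minimal sketch of the simplification suggested above: registering rules.EpsilonRule directly on GPT-2's Conv1D layer through a Composite, analogous to the nn.Linear mapping quoted above for LlaMA. The import paths (lxt.core.Composite, transformers.pytorch_utils.Conv1D) and the exact Composite/register usage are assumptions based on how lxt handles LlaMA and may differ between library versions.

```python
import torch.nn as nn
from transformers import GPT2LMHeadModel
from transformers.pytorch_utils import Conv1D  # assumed location of GPT-2's Conv1D layer
import lxt.rules as rules
from lxt.core import Composite                 # assumed import path for lxt's Composite

model = GPT2LMHeadModel.from_pretrained("gpt2")

# Conv1D computes y = x @ W + b, i.e. it is a linear operation,
# so the epsilon rule can wrap it just like nn.Linear.
attnlrp = Composite({
    Conv1D: rules.EpsilonRule,
    nn.Linear: rules.EpsilonRule,
})
attnlrp.register(model)
```

Non-linear operations inside the attention block (softmax, the attention matmul) would still need the corresponding lxt rules or functional replacements; this sketch only covers the linear Conv1D layers.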
