Producing heatmap visualizations with SubwordAggregator and SequenceAttributionAggregator #246
Hi @nfelnlp, thanks again for the detailed report! There was indeed a problem with the […]. Side note: in the example above, the fact that […]. This is the code I used to reproduce the issue, which runs correctly in the PR branch above:

    import inseq
    from inseq.data.aggregator import SubwordAggregator

    inseq_model = inseq.load_model(
        model="gpt2",
        attribution_method="attention",
        device="cuda",
    )

    input_text = (
        "Each 3 items in the following list contains the question, choice and prediction. Your task is to choose "
        "one of the choices as the answer for the question.\n"
        "Question: 'Sam has been kissing girls at school. Unfortunately, he has to stay home for a week. Why "
        "might he need to stay home?'\n"
        "Choice: '(1) disease (2) arousal (3) cooties (4) sweet (5) punishment '\n"
        "Prediction:"
    )
    prediction = "punishment"

    # Attribute text
    out = inseq_model.attribute(
        input_texts=input_text,
        generated_texts=f"{input_text} {prediction}",
        attribute_target=False,
        step_scores=["probability"],
        show_progress=True,
        generation_args={},
    )

    # Standard aggregation
    out_agg = out.aggregate()

    # Get the HTML visualization. This works perfectly fine and puts out a 1D vector of
    # attribution scores for the entire prompt up until the final generated token.
    out_viz = out_agg.show(return_html=True, do_aggregation=False)

    # First we aggregate the subword tokens. The special-symbol default is ▁ (SentencePiece);
    # we use Ġ here (GPT-2). The second aggregate call is exactly like the one above:
    # for attention, [mean, mean] (mean across the layers and heads dimensions).
    subw_sqa_agg = out.aggregate(SubwordAggregator, special_chars=("Ġ", "Ċ")).aggregate()
    subw_viz = subw_sqa_agg.show(return_html=True, do_aggregation=False)
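For completeness, this is roughly how I inspect the tensor shapes at each stage; the `sequence_attributions` / `target_attributions` attribute names below reflect my understanding of the output structure and may differ across versions:

    # Sketch: inspect attribution shapes before and after aggregation
    # (attribute names are assumptions about the output structure, not confirmed API).
    raw_attr = out.sequence_attributions[0].target_attributions
    print(raw_attr.shape)   # 4D: (target_tokens, generated_steps, layers, heads)

    agg_attr = out_agg.sequence_attributions[0].target_attributions
    print(agg_attr.shape)   # 2D after the default [mean, mean] aggregation over layers and heads

    subw_attr = subw_sqa_agg.sequence_attributions[0].target_attributions
    print(subw_attr.shape)  # 2D as well, with subword tokens merged into full words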
🐛 Bug Report
Hi @gsarti,
I've discovered an interesting behavior of the aggregators. Thanks for the help on that so far via the private chat! 😄
Since this is a bit trickier than anticipated, it makes sense to post it here.
Specifically, I'm working with a quantized Llama-2-7b (GPTQ) and a relatively long prompt for the ECQA task. I've simply copied the `input_text` of one instance and the model's `prediction` into the example code below. The resulting attribution (`out`) will have a 4D `target_attributions` tensor of shape (102, 1, 32, 32). Producing a single vector of attribution scores (102, 1) works well with the standard `SequenceAttributionAggregator`. `out_viz` produces the following: [ ... ]
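Just to spell out my understanding of that collapse, here is a small stand-alone PyTorch illustration of the shapes involved (not inseq internals; the mean over layers and heads is my assumption about what the default attention aggregation does):

    import torch

    # Stand-alone illustration of the shape bookkeeping, not inseq internals.
    # (target_tokens, generated_steps, layers, heads) as in my run: (102, 1, 32, 32)
    attr_4d = torch.rand(102, 1, 32, 32)
    # Averaging over layers and heads yields the (102, 1) matrix that out_viz renders
    attr_2d = attr_4d.mean(dim=(2, 3))
    print(attr_2d.shape)  # torch.Size([102, 1])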
However, when I want to apply the `SubwordAggregator`, I first got a 3D matrix of shape (66, 32, 32). Following your suggestion of adding the `SequenceAttributionAggregator` to the pipeline, I now get a 2D attribution matrix (see screenshot). It's not yet clear to me how the columns are supposed to be read, since the first column appears to be an artifact from before the subwords were aggregated. Is the shape of `target_attributions` here (66 x 32) correct according to your interpretation? How would I now determine the aggregated importance score of each input token? I can't apply another `.aggregate()`, right?
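For illustration, the naive fallback I have in mind would be collapsing the remaining column dimension manually; the attribute names and the NaN handling below are only my guesses about the output structure, not confirmed API:

    import torch

    # Naive manual collapse of the remaining dimension (my guess, not confirmed API).
    # Assumes the aggregated target_attributions tensor has shape (66, 32) and may
    # contain NaNs for positions that were never attributed.
    attr = subw_sqa_agg.sequence_attributions[0].target_attributions
    per_token_importance = torch.nanmean(attr, dim=1)  # one score per aggregated input token
    print(per_token_importance.shape)  # expected: torch.Size([66])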
Also, while I'm at it, I forgot how to remove the BOS token (`"<s>"`) from the resulting matrix. I only found `skip_special_tokens` and `clean_tokens` from `HuggingfaceModel`, but I don't think they can be applied here.
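The crude fallback I considered is slicing off the first row and the corresponding token label manually, though that feels like working around the library; again, the attribute names here are my assumption:

    # Manual BOS removal (attribute names assumed, probably not the intended way).
    seq = subw_sqa_agg.sequence_attributions[0]
    no_bos_attr = seq.target_attributions[1:]  # drop the row belonging to "<s>"
    no_bos_tokens = seq.target[1:]             # drop the corresponding token label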
Thanks so much for your help! 🙌
Code sample
Environment
inseq @ git+https://github.com/inseq-team/inseq.git@dfea66fc02b65ef336cdf63826ddb1b439a90786