Producing heatmap visualizations with SubwordAggregator and SequenceAttributionAggregator #246

nfelnlp · 2024-01-11T11:36:50Z

🐛 Bug Report

I've discovered an interesting behavior of the aggregators. Thanks for the help on that so far via the private chat! 😄
Since this is a bit trickier than anticipated, posting this here makes sense.

Specifically, I'm working with a quantized Llama-2-7b (GPTQ) and a relatively long prompt for the ECQA task. I've simply copied the input_text of one instance and the model's prediction into the example code below.

The resulting attribution (out) will have a 4D target_attributions tensor of shape (102, 1, 32, 32).
Producing a single vector of attribution scores (102, 1) works well with the standard SequenceAttributionAggregator.
out_viz produces the following:

[ ... ]
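For reference, the reduction from (102, 1, 32, 32) to (102, 1) can be sketched in plain Python. This is only an illustration, assuming the last two dimensions are layers and heads and that both are averaged (the default for attention, per the reply below). The function name `mean_over_layers_and_heads` and the toy values are made up for the example and are not part of Inseq.

```python
def mean_over_layers_and_heads(attr):
    """Reduce nested lists shaped (tokens, steps, layers, heads) to (tokens, steps).

    Averaging over heads and then over layers equals one flat mean over both
    dimensions, as long as every layer has the same number of heads.
    """
    reduced = []
    for token_rows in attr:
        row = []
        for step in token_rows:
            # Flatten the (layers, heads) block of this token/step pair.
            values = [v for layer in step for v in layer]
            row.append(sum(values) / len(values))
        reduced.append(row)
    return reduced


# Toy tensor: 2 tokens, 1 generation step, 2 layers, 2 heads.
toy = [
    [[[1.0, 3.0], [5.0, 7.0]]],
    [[[0.0, 0.0], [2.0, 2.0]]],
]
scores = mean_over_layers_and_heads(toy)
```

Each row of `scores` then holds one value per generation step, which matches the single column of scores described above for a single generated token.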

However, when I want to apply the SubwordAggregator, I first got a 3D matrix of shape (66, 32, 32).
Following your suggestion of adding the SequenceAttributionAggregator to the pipeline, I get a 2D attribution matrix (see screenshot). It's not clear to me yet how the columns are supposed to be read, since the first column appears to be an artifact from before the subwords were aggregated.
Is the shape of target_attributions here (66 x 32) correct according to your interpretation?

How would I now determine the aggregated importance score of each input token? I can't apply another .aggregate(), right?

Also while I'm at it, I forgot how I can remove BOS ("<s>") from the resulting matrix. I only found skip_special_tokens and clean_tokens from HuggingfaceModel, but this can't be applied here, I think.

Thanks so much for your help! 🙌

Code sample

import inseq
from inseq.data.aggregator import SequenceAttributionAggregator, SubwordAggregator
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig


model_name = "TheBloke/Llama-2-7b-Chat-GPTQ"
gpt_tokenizer = AutoTokenizer.from_pretrained(model_name)
quantization_config = GPTQConfig(bits=4, tokenizer=gpt_tokenizer)
gpt_model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quantization_config,
    device_map="auto"
)
inseq_model = inseq.load_model(
    model=gpt_model,
    attribution_method="attention",
    device="cuda"
)

input_text = ("Each 3 items in the following list contains the question, choice and prediction. Your task is to choose "
              "one of the choices as the answer for the question.\n"
              "Question: 'Sam has been kissing girls at school.  Unfortunately, he has to stay home for a week. Why "
              "might he need to stay home?'\n"
              "Choice: '(1) disease (2) arousal (3) cooties (4) sweet (5) punishment '\n"
              "Prediction: ")
prediction = "punishment"

# Attribute text
out = inseq_model.attribute(
    input_texts=input_text,
    generated_texts=f"{input_text}{prediction}",
    n_steps=1,
    attribute_target=False,
    step_scores=["probability"],
    show_progress=True,
    generation_args={}
)

# Standard aggregation
out_agg = out.aggregate()
# Get HTML visualization
out_viz = out_agg.show(return_html=True, do_aggregation=False)
# This works perfectly fine and puts out a 1D vector of attribution scores for the entire prompt up until the final generated token

# TODO: Perform subword aggregation
subw_sqa_agg = out.aggregate([SubwordAggregator, SequenceAttributionAggregator])
subw_viz = subw_sqa_agg.show(return_html=True, do_aggregation=False)
# This produces the heatmap as shown in the final screenshot

Environment

OS: Ubuntu
Python version: 3.10
Inseq version: 0.5.0 (inseq @ git+https://github.com/inseq-team/inseq.git@dfea66fc02b65ef336cdf63826ddb1b439a90786)


gsarti · 2024-01-12T16:22:37Z

Hi @nfelnlp,

Thanks again for the detailed report! There was indeed a problem with the aggregate_contiguous function in the edge case where the attribution tensor had a dimension of size 1: that dimension was getting squeezed out, causing a mismatch in the Aggregator's shape compatibility check. It will be fixed in #247.

Side note: In the example above, the trailing whitespace at the end of input_text is the reason ▁pun is not correctly integrated in the generated output: the ▁ character used for aggregation ends up as the final character of input_text. This in turn causes the next token to be tokenized oddly (e.g. pun ishment instead of ▁punishment). I suggest removing the space there and adding it to the template instead, i.e. generated_texts=f"{input_text} {prediction}" in the attribute call.
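The whitespace handling from the side note can be sketched as a small helper. `build_texts` is a hypothetical name used only for illustration, not an Inseq function. It assumes the prompt should never end with a space and that exactly one space separates prompt and prediction.

```python
def build_texts(prompt: str, prediction: str) -> tuple[str, str]:
    """Return (input_text, generated_text) with the separating space moved
    out of the prompt and into the generated text."""
    # Drop trailing spaces so the word-boundary marker is not left dangling
    # as the last token of the prompt.
    clean_prompt = prompt.rstrip(" ")
    # Re-add a single space in front of the prediction instead.
    return clean_prompt, f"{clean_prompt} {prediction}"


input_text, generated_text = build_texts("Prediction: ", "punishment")
```

The two return values can then be passed as input_texts and generated_texts to the attribute call.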

This is the code I used to reproduce the issue, which runs correctly in the PR branch above:

import inseq
from inseq.data.aggregator import SubwordAggregator

inseq_model = inseq.load_model(
    model="gpt2",
    attribution_method="attention",
    device="cuda"
)

input_text = ("Each 3 items in the following list contains the question, choice and prediction. Your task is to choose "
              "one of the choices as the answer for the question.\n"
              "Question: 'Sam has been kissing girls at school.  Unfortunately, he has to stay home for a week. Why "
              "might he need to stay home?'\n"
              "Choice: '(1) disease (2) arousal (3) cooties (4) sweet (5) punishment '\n"
              "Prediction:")
prediction = "punishment"

# Attribute text
out = inseq_model.attribute(
    input_texts=input_text,
    generated_texts=f"{input_text} {prediction}",
    attribute_target=False,
    step_scores=["probability"],
    show_progress=True,
    generation_args={}
)

# Standard aggregation
out_agg = out.aggregate()
# Get HTML visualization
out_viz = out_agg.show(return_html=True, do_aggregation=False)
# This works perfectly fine and puts out a 1D vector of attribution scores for the entire prompt up until the final generated token

# First we aggregate the subword tokens. Special symbol default is ▁ (SentencePiece), we use Ġ here (GPT-2)
# The second aggregate call is exactly like the one above: for attention, [mean, mean] (mean across the layers and heads dimensions)
subw_sqa_agg = out.aggregate(SubwordAggregator, special_chars=("Ġ", "Ċ")).aggregate()
subw_viz = subw_sqa_agg.show(return_html=True, do_aggregation=False)
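To make the effect of the subword step more concrete, here is a minimal plain-Python sketch of merging contiguous subword tokens and their scores. It is not Inseq's implementation: `merge_subwords` is a hypothetical helper, and averaging the scores of merged pieces is an assumption made for the example. Tokens that start with one of the special characters open a new word; all other tokens are appended to the previous word.

```python
def merge_subwords(tokens, scores, special_chars=("Ġ", "Ċ")):
    """Merge subword tokens into words and average their scores.

    special_chars must be a tuple of word-boundary prefixes
    (GPT-2 style here; SentencePiece would use "▁").
    """
    groups = []  # each entry: [word, [scores of its pieces]]
    for token, score in zip(tokens, scores):
        # The very first token always opens a word, even without a prefix.
        if not groups or token.startswith(special_chars):
            groups.append([token, [score]])
        else:
            groups[-1][0] += token
            groups[-1][1].append(score)
    words = [word for word, _ in groups]
    # Assumption: merged pieces are combined with a mean.
    merged = [sum(piece_scores) / len(piece_scores) for _, piece_scores in groups]
    return words, merged


words, merged = merge_subwords(
    ["ĠSam", "Ġhas", "Ġco", "oties"],
    [0.25, 0.75, 0.5, 1.0],
)
```

In this toy input the last two pieces collapse into one word, so four token rows become three. The same mechanism is what shrinks the number of rows in the attribution matrix after subword aggregation.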

nfelnlp added the bug Something isn't working label Jan 11, 2024

nfelnlp mentioned this issue Jan 11, 2024

inseq assertion error DFKI-NLP/LLMCheckup#21

Closed

gsarti mentioned this issue Jan 12, 2024

Fix aggregate_contiguous #247

Merged

gsarti linked a pull request Jan 12, 2024 that will close this issue

Fix aggregate_contiguous #247

Merged

gsarti closed this as completed in #247 Jan 12, 2024
